I, L. MICHAEL FURNESS, a citizen of the United Kingdom, residing at 2 Brookside, Exning, 
Newmarket, United Kingdom, declare that: 

1. I was employed by Incyte Genomics, Inc. (hereinafter "Incyte") as a Director of 
Pharmacogenomics until December 31, 2001. I am currently under contract to be a Consultant to 
Incyte. 

2. In 1984, I received a B.Sc.(Hons) in Biomolecular Science (Biophysics and Biochemistry) 
from Portsmouth Polytechnic. 

From 1985-1987 I was at the School of Pharmacy in London, United Kingdom, during which 
time I analyzed lipid methyltransferase enzymes using a variety of protein analysis methods, including 
one-dimensional (1D) and two-dimensional (2D) gel electrophoresis, HPLC, and a variety of enzymatic 
assay systems. 

I then worked in the Protein Structure group at the National Institute for Medical Research until 
1989, setting up core facilities for nucleic acid synthesis and sequencing, as well as assisting in programs 
on protein kinase C inhibitors. 

After a year at Perkin Elmer-Applied Biosystems as a technical specialist, I worked at the 
Imperial Cancer Research Fund from 1990 to 1992, on a Eureka-funded program collaborating with 
Amersham Pharmacia in the United Kingdom and CEPH (Centre d'Etude du Polymorphisme 
Humaine) in Paris, France, to develop novel nucleic acid purification and characterization methods. 

In 1992, I moved to Pfizer Central Research in the United Kingdom, where I stayed until 1998, 
initially setting up core DNA sequencing and then a DNA arraying facility for gene expression analysis 
in 1993. My work also included bioinformatics and I was responsible for the support of all Pfizer 
neuroscience programs in the United Kingdom. This then led me into carrying out detailed 
bioinformatics and wet lab work on the sodium channels, including antibody generation, Western and 
Northern analyses, PCR, tissue distribution studies, and sequence analyses on novel sequences 
identified. 

In 1998, I moved to Incyte to work in the Pharmacogenomics group, looking at the application 
of genomics and proteomics to the pharmaceutical industry. In 1999, I was appointed Director of the 
LifeExpress Lead Program which used microarray and protein expression data to identify 
pharmacologically and toxicologically relevant mechanisms to assist in improved drug design and 
development. 

On December 12, 2001, I founded Nuomics Consulting, Ltd., in Exning, UK, where I am 
currently employed as Managing Director. Nuomics Consulting, Ltd. provides expert technical 
knowledge and advice to businesses in the areas of genomics, proteomics, pharmacogenomics, 
toxicogenomics, and chemogenomics. 

3. I have reviewed the specification of a United States patent application that I understand was 
filed on September 16, 1999 in the names of Preeti Lai et al. and was assigned Serial No. 09/397,558 
(hereinafter "the Lai '558 application"). Furthermore, I understand that this United States patent 
application was a divisional application of, and claimed priority to, United States patent application 
Serial No. 09/083,521, filed on May 22, 1998 (hereinafter "the Lai '521 application"), having the 
identical specification. My remarks herein will therefore be directed to the Lai '521 patent application, 
and May 22, 1998, as the relevant date of filing. In broad overview, the Lai '521 specification pertains 
to certain nucleotide and amino acid sequences and their use in a number of applications, including gene 
and protein expression monitoring applications that are useful in connection with (a) developing drugs 
(e.g., for the treatment of cancer), and (b) monitoring the activity of drugs for purposes relating to 
evaluating their efficacy and toxicity. 

4. I understand that (a) the Lai '558 application contains claims that are directed to isolated 
polypeptides having either of the sequences shown as SEQ ID NO:1 and SEQ ID NO:2 (hereinafter 
"the SEQ ID NO:1 and SEQ ID NO:2 polypeptides"), and (b) the Patent Examiner has rejected those 
claims on the grounds that the specification of the Lai '558 application does not disclose a specific and 
substantial asserted utility or a well established utility for the claimed SEQ ID NO:1 and SEQ ID NO:2 
polypeptides. I further understand that whether or not a patent specification discloses a specific and 
substantial asserted utility or a well established utility for its claimed subject matter is properly 
determined from the perspective of a person skilled in the art to which the specification pertains at the 
time the patent application was filed. In addition, I understand that a specific and substantial asserted 
utility or a well established utility under the patent laws must be a "real-world" utility. 

5. I have been asked (a) to consider with a view to reaching a conclusion (or conclusions) as 
to whether or not I agree with the Patent Examiner's position that the Lai '558 application and its 
parent, the Lai '521 application, do not disclose a specific and substantial "real-world" utility for the 
claimed SEQ ID NO:1 and SEQ ID NO:2 polypeptides, and (b) to state and explain the bases for any 
conclusions I reach. I have been informed that, in connection with my considerations, I should 
determine whether or not a person skilled in the art to which the Lai '521 application pertains on May 
22, 1998, would have concluded that the Lai '521 application disclosed, for the benefit of the public, a 
specific beneficial use of the SEQ ID NO:1 and SEQ ID NO:2 polypeptides in their then available and 
disclosed forms. I have also been informed that, with respect to the "real-world" utility requirement, the 
Patent and Trademark Office instructs its Patent Examiners in Section 2107 of the Manual of Patent 
Examining Procedure, under the heading "I. 'Real-World Value' Requirement": 

"Many research tools such as gas chromatographs, screening assays, and 
nucleotide sequencing techniques have a clear, specific and unquestionable utility (e.g., 
they are useful in analyzing compounds). An assessment that focuses on whether an 
invention is useful only in a research setting thus does not address whether the specific 
invention is in fact 'useful' in a patent sense. Instead, Office personnel must distinguish 
between inventions that have a specifically identified substantial utility and inventions 
whose asserted utility requires further research to identify or reasonably confirm." 

6. I have considered the matters set forth in paragraph 5 of this Declaration and have 
concluded that, contrary to the position I understand the Patent Examiner has taken, the specification of 
the Lai '521 patent application disclosed to a person skilled in the art at the time of its filing a number of 
specific and substantial real-world utilities for the claimed SEQ ID NO:1 and SEQ ID NO:2 
polypeptides. More specifically, persons skilled in the art on May 22, 1998, would have understood 
the Lai '521 application to disclose the use of the SEQ ID NO:1 and SEQ ID NO:2 polypeptides as 
research tools in a number of gene and protein expression monitoring applications that were well- 
known at that time to be useful in connection with the development of drugs and the monitoring of the 
activity of such drugs. I explain the bases for reaching my conclusion in this regard in paragraphs 7-13 
below. 

7. In reaching the conclusion stated in paragraph 6 of this Declaration, I considered (a) the 
specification of the Lai '521 application, and (b) a number of published articles and patent documents 
that evidence gene and protein expression monitoring techniques that were well-known before the May 
22, 1998 filing date of the Lai '521 application. The published articles and patent documents I 
considered are: 

(a) Anderson, N.L., Esquer-Blasco, R., Hofmann, J.-P., Anderson, N.G., A Two-
Dimensional Gel Database of Rat Liver Proteins Useful in Gene Regulation and Drug Effects Studies, 
Electrophoresis, 12, 907-930 (1991) (hereinafter "the Anderson 1991 article") (copy annexed at Tab 
A); 

(b) Anderson, N.L., Esquer-Blasco, R., Hofmann, J.-P., Mehues, L., Raymackers, J., Steiner, 
S., Witzmann, F., Anderson, N.G., An Updated Two-Dimensional Gel Database of Rat Liver Proteins 
Useful in Gene Regulation and Drug Effect Studies, Electrophoresis, 16, 1977-1991 (1995) 
(hereinafter "the Anderson 1995 article") (copy annexed at Tab B); 

(c) Wilkins, M.R., Sanchez, J.-C., Gooley, A.A., Appel, R.D., Humphrey-Smith, I., 
Hochstrasser, D.F., Williams, K.L., Progress with Proteome Projects: Why all Proteins Expressed by a 
Genome Should be Identified and How To Do It, Biotechnology and Genetic Engineering Reviews, 13, 
19-50 (1995) (hereinafter "the Wilkins article") (copy annexed at Tab C); 

(d) Celis, J.E., Rasmussen, H.H., Leffers, H., Madsen, P., Honore, B., Gesser, B., Dejgaard, 
K., Vandekerckhove, J., Human Cellular Protein Patterns and their Link to Genome DNA Sequence 
Data: Usefulness of Two-Dimensional Gel Electrophoresis and Microsequencing, FASEB Journal, 5, 
2200-2208 (1991) (hereinafter "the Celis article") (copy annexed at Tab D); 

(e) Franzen, B., Linder, S., Okuzawa, K., Kato, H., Auer, G., Nonenzymatic Extraction of 
Cells from Clinical Tumor Material for Analysis of Gene Expression by Two-Dimensional 
Polyacrylamide Gel Electrophoresis, Electrophoresis, 14, 1045-1053 (1993) (hereinafter "the Franzen 
article") (copy annexed at Tab E); 

(f) Bjellqvist, B., Basse, B., Olsen, E., Celis, J.E., Reference Points for Comparisons of Two- 
Dimensional Maps of Proteins from Different Human Cell Types Defined in a pH Scale Where 
Isoelectric Points Correlate with Polypeptide Compositions, Electrophoresis, 15, 529-539 (1994) 
(hereinafter "the Bjellqvist article") (copy annexed at Tab F); and 

(g) Large Scale Biology Company Info; LSB and LSP Information; from http://www.lsbc.com 
(2001) (copy annexed at Tab G). 

8. Many of the published articles I considered (i.e., at least items (a)-(f) identified in paragraph 
7) relate to the development of protein two-dimensional gel electrophoretic techniques for use in gene 
and protein expression monitoring applications in drug development and toxicology. As I will discuss 
below, a person skilled in the art who read the Lai '521 application on May 22, 1998 would have 
understood that application to disclose the SEQ ID NO:1 and SEQ ID NO:2 polypeptides to be useful 
for a number of gene and protein expression monitoring applications, e.g., in the use of two-dimensional 
polyacrylamide gel electrophoresis and western blot analysis of tissue samples in drug development and 
in toxicity testing. 

9. Turning more specifically to the Lai '521 specification, the SEQ ID NO:1 and SEQ ID 
NO:2 polypeptides are shown at pages 51-53 as two of seven sequences under the heading "Sequence 
Listing." The Lai '521 specification specifically teaches that the "invention features substantially purified 
polypeptides, prostate growth-associated membrane proteins, referred to collectively as 'PGAMP' and 
individually as 'PGAMP-1' and 'PGAMP-2.' In one aspect, the invention provides a substantially 
purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID 
NO:1, SEQ ID NO:2, a fragment of SEQ ID NO:1, and a fragment of SEQ ID NO:2." (Lai '521 
application at page 3, lines 5-9, as amended). With respect to SEQ ID NO:1, the Lai '521 
specification teaches that (a) the identity of the SEQ ID NO:1 polypeptide was determined from a 
"prostate cDNA library", (b) the SEQ ID NO:1 polypeptide is the human prostate growth-associated 
membrane protein referred to as "PGAMP-1" and is encoded by SEQ ID NO:3, and (c) northern 
analysis shows that PGAMP-1 is expressed "in various libraries, at least 72% of which are 
immortalized or cancerous and at least 18% of which involve immune response. Of particular note is 
the expression of PGAMP-1 in cancerous or hyperplastic prostate (48%) and breast (7%)" tissues and 
therefore PGAMP-1 "appears to play a role in neoplastic and reproductive disorders" (Lai '521 
application at page 13, lines 27-32; page 14, lines 10-13; and page 25, lines 15-17). With respect to 
SEQ ID NO:2, the Lai '521 specification teaches that (a) the identity of the SEQ ID NO:2 polypeptide 
was determined from a "breast cDNA library", (b) the SEQ ID NO:2 polypeptide is the human 
prostate growth-associated membrane protein referred to as "PGAMP-2" and is encoded by SEQ ID 
NO:4, and (c) northern analysis shows that PGAMP-2 is expressed "in various libraries, at least 76% 
of which are immortalized or cancerous and at least 18% of which involve immune response. Of 
particular note is the expression of PGAMP-2 in cancerous or hyperplastic prostate (28%) and breast 
(10%)" tissues and therefore PGAMP-2 "appears to play a role in neoplastic and reproductive 
disorders" (Lai '521 application at page 14, lines 14-19; page 15, lines 4-8; and page 25, lines 20-22). 

The Lai '521 application discusses a number of uses of the SEQ ID NO:1 and SEQ ID NO:2 
polypeptides in addition to their use in gene and protein expression monitoring applications. I have not 
fully evaluated these additional uses in connection with the preparation of this Declaration and do not 
express any views in this Declaration regarding whether or not the Lai '521 specification discloses these 
additional uses to be substantial, specific and credible real-world utilities of the SEQ ID NO:1 and 
SEQ ID NO:2 polypeptides. Consequently, my discussion in this Declaration concerning the Lai '521 
application focuses on the portions of the application that relate to the use of the SEQ ID NO:1 and 
SEQ ID NO:2 polypeptides in gene and protein expression monitoring applications. 

10. The Lai '521 application discloses that the polynucleotide sequences disclosed therein, 
including the polynucleotides encoding the SEQ ID NO:1 and SEQ ID NO:2 polypeptides, are useful 
as probes in chip based technologies. It further teaches that the chip based technologies can be used 
"for the detection and/or quantification of nucleic acid or protein" (Lai '521 application at page 23, lines 
5-8). 

The Lai '521 application also discloses that the SEQ ID NO:1 and SEQ ID NO:2 
polypeptides are useful in other protein expression detection technologies. The Lai '521 application 
states that "[I]mmunological methods for detecting and measuring the expression of PGAMP using 
either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques 
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence 
activated cell sorting (FACS)" (Lai '521 application at page 23, lines 9-12). Furthermore, the Lai 
'521 application discloses that "[a] variety of protocols for measuring PGAMP, including ELISAs, 
RIAs, and FACS, are known in the art and provide a basis for diagnosing altered or abnormal levels of 
PGAMP expression. Normal or standard values for PGAMP expression are established by combining 
body fluids or cell extracts taken from normal mammalian subjects, preferably human, with antibody to 
PGAMP under conditions suitable for complex formation" (Lai '521 application at page 34, lines 2-6). 

In addition, at the time of filing the Lai '521 application, it was well known in the art that "gene" 
and protein expression analyses also included two-dimensional polyacrylamide gel electrophoresis (2-D 
PAGE) technologies, which were developed during the 1980s, as exemplified by the Anderson 1991 
and 1995 articles (Tab A and Tab B). The Anderson 1991 article teaches that a 2-D PAGE map has 
been used to connect and compare hundreds of 2-D gels of rat liver samples from a variety of studies 
including regulation of protein expression by various drugs and toxic agents (Tab A at p. 907). The 
Anderson 1991 article teaches an empirically-determined standard curve fitted to a series of identified 
proteins based upon amino acid chain length, and how that standard curve can be used in protein 
expression analysis (Tab A at p. 911). The Anderson 1991 article teaches that "there is a long-term 
need for a comprehensive database of liver proteins" (Tab A at p. 912). 

The Wilkins article is one of a number of documents that were published prior to the May 22, 
1998 filing date of the Lai '521 application that describes the use of the 2-D PAGE technology in a 
wide range of gene and protein expression monitoring applications, including monitoring and analyzing 
protein expression patterns in human cancer, human serum plasma proteins, and in rodent liver 
following exposure to toxins. In view of the Lai '521 application, the Wilkins article, and other related 
pre-May 1998 publications, persons skilled in the art on May 22, 1998 clearly would have understood 
the Lai '521 application to disclose the SEQ ID NO:1 and SEQ ID NO:2 polypeptides to be useful in 
2-D PAGE analyses for the development of new drugs and for monitoring the activities of drugs for 
such purposes as evaluating their efficacy and toxicity, as explained more fully in paragraph 12 below. 

With specific reference to toxicity evaluations, those of skill in the art who were working on 
drug development in May 1998 (and for many years prior to May 1998) without any doubt 
appreciated that the toxicity (or lack of toxicity) of any proposed drug they were working on was one 
of the most important criteria to be considered and evaluated in connection with the development of the 
drug. They would have understood at that time that good drugs are not only potent, they are specific. 
This means that they have strong effects on a specific biological target and minimal effects on all other 
biological targets. Ascertaining that a candidate drug affects its intended target, and identifying 
undesirable secondary effects (i.e., toxic side effects), had been for many years among the main 
challenges in developing new drugs. The ability to determine which genes are positively affected by a 
given drug, coupled with the ability to quickly and at the earliest time possible in the drug development 
process identify drugs that are likely to be toxic because of their undesirable secondary effects, have 
enormous value in improving the efficiency of the drug discovery process, and are an important and 
essential part of the development of any new drug. In fact, the desire to identify and understand 
toxicological effects using the experimental assays described above led Dr Leigh Anderson to found the 
Large Scale Biology Corporation in 1987, in order to pursue commercial development of the 2-D 
electrophoretic protein mapping technology he had developed. In addition, the company focused on 
toxicological effects on the proteome as clearly demonstrated by its goals and by its senior management 
credentials described in company documents (see Tab G at pp. 1, 3, and 5). 

Accordingly, the teachings in the Lai '521 application, in particular regarding use of the SEQ ID 
NO:1 and SEQ ID NO:2 polypeptides in differential gene and protein expression analysis (2-D PAGE 
maps) and in the development and the monitoring of the activities of drugs, clearly include toxicity 
studies, and persons skilled in the art who read the Lai '521 application on May 22, 1998 would have 
understood that to be so. 

11. As previously discussed (supra, paragraphs 7 and 8), in the mid-1980s the several 
publications annexed to this Declaration at Tabs A through F evidence information that was available to 
the public regarding two-dimensional polyacrylamide gel electrophoresis technology and its uses in drug 
discovery and toxicology testing before the May 22, 1998 filing date of the Lai '521 application. In 
particular the Celis article stated that "protein databases are expected to foster a variety of biological 
information... - among others, ... drug development and testing" (See Tab D, p. 2200, second 
column). The Franzen article shows that 2-D PAGE maps were used to identify proteins in clinical 
tumor material (See Tab E). The Lai '521 application clearly discloses that expression of PGAMP-1 
and/or PGAMP-2 is associated with immortalized cell lines, cancerous and hyperplastic prostate and 
breast tissue, and with the immune response (Lai '521 application at page 14, lines 10-13; page 15, 
lines 4-8; and page 25, lines 15-16 and 20-21). The Bjellqvist article showed that a protein may be 
identified accurately by its positional coordinates, namely molecular mass and isoelectric point (See Tab 
F). The Lai '521 application clearly disclosed SEQ ID NO:1 and SEQ ID NO:2 from which it would 
have been routine for one of skill in the art to predict both the molecular mass and the isoelectric point 
using algorithms well known in the art at the time of filing. 



12. A person skilled in the art on May 22, 1998 who read the Lai '521 application, would 
understand that application to disclose the SEQ ID NO:1 and SEQ ID NO:2 polypeptides to be highly 
useful in analysis of differential expression of proteins. For example, the specification of the Lai '521 
application would have led a person skilled in the art in May 1998, who was using protein expression 
monitoring in connection with developing new drugs for the treatment of a neoplastic or reproductive 
disorder to conclude that a 2-D PAGE map that used the substantially purified SEQ ID NO:1 and SEQ 
ID NO:2 polypeptides would be a highly useful tool and to request specifically that any 2-D PAGE 
map that was being used for such purposes utilize the SEQ ID NO:1 and/or SEQ ID NO:2 
polypeptides. Expressed proteins are useful for 2-D PAGE analysis in toxicology expression studies 
for a variety of reasons, particularly for purposes relating to providing controls for the 2-D PAGE 
analysis, and for identifying sequence or post-translational variants of the expressed sequences in 
response to exogenous compounds. Persons skilled in the art would appreciate that a 2-D PAGE map 
that utilized the SEQ ID NO:1 and SEQ ID NO:2 polypeptide sequences would be a more useful tool 
than a 2-D PAGE map that did not utilize these protein sequences in connection with conducting 
protein expression monitoring studies on proposed (or actual) drugs for treating neoplastic and 
reproductive disorders for such purposes as evaluating their efficacy and toxicity. 

I discuss in more detail in items (a)-(b) below a number of reasons why a person skilled in the 
art, who read the Lai '521 specification in May 1998, would have concluded based on that 
specification and the state of the art at that time, that the SEQ ID NO:1 and SEQ ID NO:2 
polypeptides would be highly useful tools for analysis of a 2-D PAGE map for evaluating the efficacy 
and toxicity of proposed drugs for neoplastic and reproductive disorders by means of 2-D PAGE 
maps, as well as for other evaluations. 

(a) The Lai '521 specification contains a number of teachings that would lead persons 
skilled in the art on May 22, 1998 to conclude that a 2-D PAGE map that utilized the substantially 
purified SEQ ID NO:1 and/or SEQ ID NO:2 polypeptides would be a more useful tool for gene and 
protein expression monitoring applications relating to drugs for treating neoplastic and reproductive 
disorders than a 2-D PAGE map that did not use the SEQ ID NO:1 and/or SEQ ID NO:2 
polypeptides. Among other things, the Lai '521 specification teaches that (i) the identity of the SEQ ID 
NO:1 polypeptide was determined from a prostate cDNA library, (ii) the SEQ ID NO:1 polypeptide is 
the prostate growth-associated membrane protein referred to as PGAMP-1, and (iii) PGAMP-1 is 
expressed in various libraries derived from immortalized and cancerous tissues, cancerous or 
hyperplastic prostate and breast tissues, and tissues involved in the immune response, and, therefore, 
PGAMP-1 expression is "associated with neoplastic and reproductive disorders" (Lai '521 application 
at page 13, lines 27-32; page 14, lines 10-13; and page 25, lines 15-17; see paragraph 9, supra). 
Furthermore, the Lai '521 specification teaches that (i) the identity of the SEQ ID NO:2 polypeptide 
was determined from a breast cDNA library, (ii) the SEQ ID NO:2 polypeptide is the prostate growth- 
associated membrane protein referred to as PGAMP-2, and (iii) PGAMP-2 is expressed in various 
libraries derived from immortalized and cancerous tissues, cancerous or hyperplastic prostate and 
breast tissues, and tissues involved in the immune response, and, therefore, PGAMP-2 expression is 
"associated with neoplastic and reproductive disorders" (Lai '521 application at page 14, lines 14-19; 
page 15, lines 4-8; and page 25, lines 20-22; see paragraph 9, supra). The substantially purified SEQ 
ID NO:1 and SEQ ID NO:2 polypeptides could, therefore, be used as controls to more accurately 
gauge the expression of PGAMP in a sample, and consequently more accurately gauge the effect of a 
toxicant on expression of the gene. 

Moreover, the Lai '521 specification teaches that SEQ ID NO:1 and SEQ ID NO:2 
share chemical and structural homology with known tumor-associated antigens. PGAMP-1 shares 
chemical and structural homology with rat heat-stable antigen CD4. These polypeptides share 21% 
identity and two potential transmembrane domains (Lai '521 application at page 14, lines 6-8; and 
Figure 1). In addition, PGAMP-1 has chemical similarity with CD44 antigen precursor (Lai '521 
application at page 14, lines 1-5). PGAMP-2 shares chemical and structural homology with human 
prostate-specific antigen and a fragment of the mouse apoptosis-associated tyrosine kinase, sharing 
18% and 17% identity, respectively (Lai '521 application at page 14, lines 29-33; and Figures 2A, 2B, 
and 2C). In addition, all three of these proteins share six potential transmembrane regions and a 
potential signal peptide, and PGAMP-2 and human prostate-specific antigen have similar isoelectric 
points (Lai '521 application at page 15, lines 1-2). 

(b) Persons skilled in the art on May 22, 1998 would have appreciated (i) that the 
protein expression monitoring results obtained using a 2-D PAGE map that utilized the SEQ ID NO:1 
and/or SEQ ID NO:2 polypeptides would vary, depending on the particular drug being evaluated, and 
(ii) that such varying results would occur both with respect to the results obtained from the SEQ ID 
NO:1 and/or SEQ ID NO:2 polypeptides and from the 2-D PAGE map as a whole (including all its 
other individual proteins). These kinds of varying results, depending on the identity of the drug being 
tested, in no way detract from my conclusion that persons skilled in the art on May 22, 1998, having 
read the Lai '521 specification, would specifically request that any 2-D PAGE map that was being used 
for conducting protein expression monitoring studies on drugs for treating neoplastic and reproductive 
disorders (e.g., a toxicology study or any efficacy study of the type that typically takes place in 
connection with the development of a drug) utilize the SEQ ID NO:1 and/or SEQ ID NO:2 
polypeptides. Persons skilled in the art on May 22, 1998 would have wanted their 2-D PAGE map to 
utilize the SEQ ID NO:1 and/or SEQ ID NO:2 polypeptides because a 2-D PAGE map that utilized 
these polypeptides (as compared to one that did not) would provide more useful results in the kind of 
gene and protein expression monitoring studies using 2-D PAGE maps that persons skilled in the art 
have been doing since well prior to May 22, 1998. 

The foregoing is not intended to be an all-inclusive explanation of all my reasons for reaching 
the conclusions stated in this paragraph 12, and in paragraph 6, supra. In my view, however, it 
provides more than sufficient reasons to justify my conclusions stated in paragraph 6 of this Declaration 
regarding the Lai '521 application disclosing to persons skilled in the art at the time of its filing 
substantial, specific and credible real-world utilities for the SEQ ID NO:1 and SEQ ID NO:2 
polypeptides. 

13. Also pertinent to my considerations underlying this Declaration is the fact that the Lai '521 
disclosure regarding the uses of the SEQ ID NO:1 and SEQ ID NO:2 polypeptides for protein 
expression monitoring applications is not limited to the use of these proteins in 2-D PAGE maps. For 
one thing, the Lai '521 disclosure regarding the technique used in gene and protein expression 
monitoring applications is broad (Lai '521 application at, e.g., page 23, lines 3 to 31; and page 34, lines 
2-10). 

In addition, the Lai '521 specification repeatedly teaches that the proteins described therein 
(including the SEQ ID NO:1 and SEQ ID NO:2 polypeptides) may desirably be used in any of a 
number of long established "standard" techniques, such as ELISA or western blot analysis, for 
conducting protein expression monitoring studies. See, e.g.: 

(a) Lai '521 application at p. 23, lines 9-12 ("Immunological methods for detecting 
and measuring the expression of PGAMP using either specific polyclonal or monoclonal antibodies are 
known in the art. Examples of such techniques include enzyme-linked immunosorbent assays 
(ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS)"); and 

(b) Lai '521 application at p. 34, lines 2-10 ("A variety of protocols for measuring 
PGAMP, including ELISAs, RIAs, and FACS, are known in the art and provide a basis for diagnosing 
altered or abnormal levels of PGAMP expression. Normal or standard values for PGAMP expression 
are established by combining body fluids or cell extracts taken from normal mammalian subjects, 
preferably human, with antibody to PGAMP under conditions suitable for complex formation[.] The 
amount of standard complex formation may be quantified by various methods, preferably by 
photometric means. Quantities of PGAMP expressed in subject, control, and disease samples from 
biopsied tissues are compared with the standard values. Deviation between standard and subject 
values establishes the parameters for diagnosing disease"). 

Thus, a person skilled in the art on May 22, 1998, who read the Lai '521 specification, would 
have routinely and readily appreciated that the SEQ ID NO:1 and SEQ ID NO:2 polypeptides, 
disclosed therein, would be useful to conduct gene and protein expression monitoring analyses using 2- 
D PAGE mapping or western blot analysis or any of the other traditional membrane-based protein 
expression monitoring techniques that were known and in common use many years prior to the filing of 
the Lai '521 application. For example, a person skilled in the art in May 1998 would have routinely 
and readily appreciated that the SEQ ID NO:1 and SEQ ID NO:2 polypeptides would be useful tools 
in conducting protein expression analyses, using the 2-D PAGE mapping or western analysis 
techniques, in furtherance of (a) the development of drugs for the treatment of neoplastic and 
reproductive disorders, and (b) analyses of the efficacy and toxicity of such drugs. 

14. I declare further that all statements made herein of my own knowledge are true and that all 
statements made herein on information and belief are believed to be true; and further, that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, and that willful false statements may jeopardize the validity 
of this application and any patent issuing thereon. 



L. Michael Furness, B.Sc. 



Signed at Cambridge, United Kingdom 
this day of February, 2002. 

A two-dimensional gel database of rat liver proteins 
useful in gene regulation and drug effects studies 

A standard two-dimensional (2-D) protein map of Fischer 344 rat liver (F344MST3) is presented, 
with a tabular listing of more than 1200 protein species. Sodium dodecyl sulfate (SDS) molecular 
mass and isoelectric point have been established, based on positions of numerous internal 
standards. This map has been used to connect and compare hundreds of 2-D gels of rat liver samples 
from a variety of studies, and forms the nucleus of an expanding database describing rat liver 
proteins and their regulation by various drugs and toxic agents. An example of such a study, 
involving regulation of cholesterol synthesis by cholesterol-lowering drugs and a high-cholesterol 
diet, is presented. Since the map has been obtained with a widely used and highly reproducible 
2-D gel system (the Iso-Dalt® 
system), it can be directly related to an expanding body of work in other laboratories. 

1 Introduction 

High-resolution two-dimensional electrophoresis of proteins, introduced in 1975 by O'Farrell and 
others [1-4], has been used over the ensuing 16 years to examine a wide variety of biological 
systems, the results appearing in more than 5000 published papers. With the advent of computerized 
systems for analyzing two-dimensional (2-D) gel images and constructing spot databases, it is also 
possible to plan and assemble integrated bodies of information describing the appearance and 
regulation of thousands of protein gene products [5, 6]. Creating such databases involves amassing 
and organizing quantitative data from thousands of 2-D gels, and requires a substantial commitment 
in technology and resources. 

Given the long-term effort required to develop a protein database, the choice of a biological 
system takes on considerable importance. While in vitro systems are ideal for answering many 
experimental questions, especially in cancer research and genetics, our experience with cell 
cultures and tissue samples suggests that some in vivo approaches could have major advantages. In 
particular, we have noticed that liver tissue samples from rats and mice appear to show greater 
quantitative reproducibility (in terms of individual protein expression) than replicate cell 
cultures. This is perhaps a natural result of the homeostasis maintained in a complete animal vs. 
the well-known variability of cell cultures, the latter due principally to differences in reagents 
(e.g., fetal bovine serum), conditions (e.g., pH) and genetic "evolution" of cell lines while in 
culture. It is also more difficult to generate adequate amounts of protein from cell culture 
systems (particularly with attached cells), forcing the investigator to resort to 
radioisotope-based or silver-based stain detection methods. While these methods are more sensitive 
(sometimes much more sensitive) than the Coomassie Brilliant Blue (CBB) stain typically used for 
protein detection in "large" protein samples, they are generally more variable, more 
labor-intensive and, in the case of radiographic methods, may generate highly "noisy" images, due 
to the properties of the films used. By contrast, large protein samples can easily be prepared 
from liver using urea/Nonidet P-40 (NP-40) solubilization and stained with CBB, which has the 
advantage of being easily reproducible [8]. Finally, there remains the question of the 
"truthfulness" of many in vitro systems as compared to their in vivo analogs; how 
great are the changes caused by the introduction into a cul- 

(i.e., 4 mL per 0.5 g tissue) and the mixture is homogenized using first the loose- and then the 
tight-fitting glass pestle. This takes approximately 5 strokes with each pestle and is carried out 
at room temperature because urea would crystallize out in the cold. Once the liver sample is 
thoroughly homogenized in the solubilizer, it is assumed that all the proteins are denatured (by 
the chaotropic effect of the urea and NP-40 detergent) and the enzymes inactivated by the high pH 
(~9.5). Therefore these samples may be kept at room temperature until they can be centrifuged or 
frozen as a group (within several hours of preparation). Most samples are centrifuged for 
6 × 10^6 g·min (e.g., 500 000 g for 12 min using a Beckman TL-100 centrifuge). The centrifuge rotor 
is maintained at just below room temperature (e.g., 15-20 °C), but not too cold, so as to prevent 
the precipitation of urea. The centrifuge of choice is a Beckman TL-100 because of the sample tube 
sizes available, but any ultracentrifuge accepting smallish tubes will suffice. When an 
appropriate centrifuge is not available near the site of sample preparation, samples can be frozen 
at -80 °C and thawed prior to centrifugation and collection of supernatants. Each supernatant is 
carefully removed following centrifugation and aliquoted into at least 4 clean tubes for storage. 
This is done by transferring all the supernatant to one clean tube, mixing this gently (to assure 
homogeneous composition) and then dividing it into 4 aliquots. The aliquots are frozen immediately 
at -80 °C. These multiple aliquots can provide insurance against a failed run or a freezer 
breakdown. 
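
The centrifugation requirement above is stated as a product of relative centrifugal force and time, so the run time at a different force follows by division. The following is a minimal sketch in Python, assuming the g·min product is simply held constant when another rotor speed is used; the function name is illustrative and does not come from the source.

```python
def spin_minutes(dose_g_min: float, rcf_g: float) -> float:
    """Return the minutes needed at a given force (in g) to reach a target dose in g*min.

    Assumption: the force-time product is kept constant when the force changes.
    """
    if rcf_g <= 0:
        raise ValueError("rcf_g must be positive")
    return dose_g_min / rcf_g


TARGET_DOSE = 6e6  # g*min, the figure quoted in the text

if __name__ == "__main__":
    # 500 000 g is the example force given in the text.
    print(spin_minutes(TARGET_DOSE, 500_000))
    # A slower rotor (hypothetical value) needs proportionally longer.
    print(spin_minutes(TARGET_DOSE, 100_000))
```

At 500 000 g this reproduces the 12 min quoted in the text; the second call only illustrates the scaling.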

2.2 Two-dimensional electrophoresis 

Sample proteins are resolved by 2-D electrophoresis using the 20 × 25 cm Iso-Dalt® 2-D gel system 
([26-29]; produced by LSB and by Hoefer Scientific Instruments, San Francisco) operating with 20 
gels per batch. All first-dimensional isoelectric focusing (IEF) gels are prepared using the same 
single standardized batch of carrier ampholytes (BDH 4-8A in the present case, selected by LSB's 
batch-testing program for rat and mouse database work*). A 10 µL sample of solubilized liver 
protein is applied to each gel, and the gels are run for 33 000 to 34 500 volt-hours using a 
progressively increasing voltage protocol implemented by a programmable high-voltage power supply. 
An Angelique™ computer-controlled gradient-casting system (produced by LSB) is used to prepare 
second-dimensional sodium dodecyl sulfate (SDS) polyacrylamide gradient slab gels in which the top 
5% of the gel is 11 %T acrylamide, and the lower 95% of the gel varies linearly from 11% to 18 %T. 

This system has recently been modified so as to employ a commercially available 30.8 %T 
acrylamide/N,N'-methylenebisacrylamide prepared solution (thus avoiding the handling of the solid 
acrylamide monomer) and three additional stock solutions: buffer (made from Sigma pre-set [...]), 
persulfate and N,N,N',N'-tetramethylethylenediamine (TEMED). Each gel is identified by a 
computer-printed filter paper label polymerized into the lower left corner of the gel. 
First-dimensional IEF tube gels are loaded directly (as extruded) onto the slab gels without 
equilibration, and held in place by polyester fabric wedges (Wedgies®, produced by LSB) to avoid 
the use of hot agarose. Second-dimensional slab gels are run overnight, in groups of 20, in cooled 
DALT tanks (10 °C) with buffer circulation. All run parameters, reagent source and lot 
information, and notations of deviation from expected results are entered by the technician 
responsible on a detailed, multi-page record of the experiment. 

* This material (succeeding certified batches of which are available from Hoefer Scientific 
Instruments) has the most linear pH gradient produced by any ampholyte tested, except for the 
Pharmacia wide range (which has an unacceptable tendency to bind high-molecular weight acidic 
proteins, causing them to streak). 
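
The slab gel composition described in this section (an 11 %T plateau over the top 5% of the gel, then a linear rise to 18 %T at the bottom) can be written as a simple piecewise function. The sketch below is illustrative only: the function and parameter names are assumptions, and it says nothing about how the gradient-casting system itself is programmed.

```python
def percent_t(depth_fraction: float,
              plateau: float = 0.05,
              top_t: float = 11.0,
              bottom_t: float = 18.0) -> float:
    """Nominal acrylamide concentration (%T) at a fractional depth in the slab.

    depth_fraction is 0.0 at the top edge of the gel and 1.0 at the bottom.
    """
    if not 0.0 <= depth_fraction <= 1.0:
        raise ValueError("depth_fraction must lie between 0 and 1")
    if depth_fraction <= plateau:
        return top_t  # uniform region at the top of the gel
    # Linear gradient over the remaining part of the gel.
    return top_t + (bottom_t - top_t) * (depth_fraction - plateau) / (1.0 - plateau)


if __name__ == "__main__":
    for depth in (0.0, 0.05, 0.525, 1.0):
        print(depth, round(percent_t(depth), 2))
```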

2.3 Staining 

Following SDS electrophoresis, slab gels are stained for protein using a colloidal Coomassie Blue 
G-250 procedure in covered plastic boxes, with 10 gels (totalling approximately 1 L of gel) per 
box. This procedure (based on the work of Neuhoff [30, 31]) involves fixation in 1.5 L of 50% 
ethanol and 2% phosphoric acid for 2 h, three 30 min washes, each in 2 L of cold tap water, and 
transfer to 1.5 L of 34% methanol, 17% ammonium sulfate and 2% phosphoric acid for 1 h, followed 
by the addition of a gram of powdered Coomassie Blue G-250 stain. Staining requires approximately 
4 days to reach equilibrium intensity, whereupon gels are transferred to cool tap water and their 
surfaces rinsed to remove any particulate stain prior to scanning. Gels may be kept for several 
months in water with added sodium azide. The water washes remove ethanol that would dissolve the 
stain (and render the system noncolloidal, with high backgrounds). The concentrated ammonium 
sulfate and methanol solution is diluted by equilibration with the water volume of the gels to 
automatically achieve the correct final concentrations for colloidal staining. Practical 
advantages of this staining approach can be summarized as follows: (i) the low, flat background 
makes computer evaluation of small spots (max OD < 0.02) possible, especially when using laser 
densitometry; (ii) up to 1500 spots can be reliably detected on many gels (e.g., rat liver) at 
loadings low enough to preserve excellent resolution; and (iii) reproducibility appears to be very 
good: at least several hundred spots have coefficients of reproducibility less than 15%. This 
value is at least as good as previous CBB methods, and significantly better than many silver stain 
systems. 
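
The dilution step in the staining procedure is plain volume arithmetic: 1.5 L of concentrated solution equilibrates with roughly 1 L of gel. The sketch below works this out under two simplifying assumptions that are not stated in the source: the gels are treated as pure water after the washes, and equilibration is taken to be complete. The function name is illustrative.

```python
def final_concentrations(stock_percent: dict, stock_volume_l: float,
                         gel_volume_l: float) -> dict:
    """Dilute each stock concentration by the combined stock-plus-gel volume.

    Assumptions: the gel volume behaves as pure water and mixing is complete.
    """
    if stock_volume_l <= 0 or gel_volume_l < 0:
        raise ValueError("volumes must be positive")
    factor = stock_volume_l / (stock_volume_l + gel_volume_l)
    return {name: pct * factor for name, pct in stock_percent.items()}


if __name__ == "__main__":
    stock = {"methanol": 34.0, "ammonium sulfate": 17.0, "phosphoric acid": 2.0}
    # 1.5 L of solution and about 1 L of gel per box, as given in the text.
    for name, pct in final_concentrations(stock, 1.5, 1.0).items():
        print(f"{name}: {pct:.1f}%")
```

Under these assumptions the methanol and ammonium sulfate end up near 20% and 10% respectively.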

2.4 Positional standardization 

The carbamylated rabbit muscle creatine phosphokinase (CPK) standards [32] are purchased from 
Pharmacia and BDH. Amino acid compositions, and numbers of residues present in proteins used for 
internal standardization, are taken from the Protein Identification Resource (PIR) sequence 
database [33]. 



2.5 Computer analysis 

Stained slab gels are digitized in red light at 134 micron re- 
solution, using either a Molecular Dynamics laser scanner 
(with pixel sampling) or an Eikonix 78/99 CCD scanner. 
Raw digitized gel images are archived on high-density DAT 
tape (or equivalent storage media) and a grayscale videoprint prepared from the raw digital image 
as hard-copy backup of the gel image. Gels are processed using the Kepler® software system 
(produced by LSB), a commercially available workstation-based software package built on 

We include here a useful series of 22 orienting identifications as an aid to other users of the 
master pattern (Table [...]). 



3.2 Carbamylated charge standards, computed pIs and molecular mass standardization 

We have previously shown that the use of a system of closely spaced internal pI markers (made by 
carbamylating a protein) offers an accurate and workable solution to the problem of assigning 
positions in the pI dimension [32]. The same system, based on 36 protein species made by 
carbamylating rabbit muscle CPK, has been used here to assign pIs to most rat liver acidic and 
neutral proteins. The standards were coelectrophoresed with total liver proteins, and the standard 
spots added to a special version of the master pattern F344MST3. The gel X-coordinates of all 
liver protein spots lying within the CPK charge train were then transformed into CPK pI positions 
by interpolation between the positions of immediately adjacent standards (Table 1) using a Kepler® 
vector procedure. 
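
The interpolation step just described can be illustrated with a short Python sketch. It is not the Kepler vector procedure, whose details the text does not give; it assumes straight-line interpolation between the two nearest standards, and the standard positions in the example are hypothetical values, not entries from Table 1.

```python
from bisect import bisect_right
from typing import Optional, Sequence, Tuple


def cpk_pi(x: float, standards: Sequence[Tuple[float, float]]) -> Optional[float]:
    """Interpolate a CPK pI position from a gel X-coordinate.

    standards: (x_position, cpk_index) pairs sorted by strictly increasing x_position.
    Returns None when the spot lies outside the charge train.
    """
    xs = [pos for pos, _ in standards]
    if x < xs[0] or x > xs[-1]:
        return None
    i = bisect_right(xs, x)
    if i == len(xs):  # the spot sits exactly on the last standard
        return float(standards[-1][1])
    (x0, v0), (x1, v1) = standards[i - 1], standards[i]
    return v0 + (v1 - v0) * (x - x0) / (x1 - x0)


if __name__ == "__main__":
    # Hypothetical X positions for four adjacent members of the charge train.
    demo_standards = [(100.0, -3.0), (160.0, -2.0), (230.0, -1.0), (310.0, 0.0)]
    print(cpk_pi(195.0, demo_standards))  # midway between the -2 and -1 standards
    print(cpk_pi(50.0, demo_standards))   # outside the train
```

Returning None for spots beyond the ends of the train is a design choice of this sketch, made so that such spots are flagged rather than extrapolated.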

It has proven possible to compute fairly accurate pI values of many proteins from the amino acid 
composition [42]. We have attempted here to test a further elaboration of this approach, in which 
we computed pIs for the CPK standards themselves, based on our knowledge of the rabbit muscle CPK 
sequence and the fact that adjacent members of the charge train typically differ by blockage of 
one additional lysine residue (Table 3). We compared these values to similar computed pIs for an 
additional set of carbamylated standards made from human hemoglobin beta chains and a series of 
rat liver and human plasma proteins of known position and sequence (Fig. 7, Table 4). The result 
demonstrates good concordance between these systems. Two proteins show significant deviations: 
liver fatty-acid binding protein (FABP; #1 in Table 4) and protein disulphide isomerase (#[...] in 
the table). The FABP spot present on F344MST3 may represent a charge-modified version of a more 
basic parent spot closer to the expected pI, not resolved in the IEF/SDS gel. Of particular 
importance is the fact that, by comparing computed pIs of sequenced but unlocated proteins with 
the CPK pIs, we can assign a probable gel location without making any assumptions regarding the 
actual gel pH gradient. This offers a useful shortcut, given the vagaries of pH measurement on 
small diameter IEF gels. We have used this approach to compute the CPK pIs of all rat and mouse 
proteins in the PIR sequence database, as an aid to protein identification (data not shown). 
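
To make the idea of a computed charge train concrete, the sketch below estimates a pI from an amino acid composition and then repeats the calculation with one more lysine blocked at each step. It is only an illustration under stated assumptions: the pKa values are generic textbook-style figures chosen here, the composition is hypothetical, and the method of reference [42] may differ.

```python
# Assumed pKa values for ionizable groups (illustrative; not taken from the source).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "Nterm": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "Cterm": 3.1}


def net_charge(counts: dict, ph: float) -> float:
    """Net charge at a given pH from Henderson-Hasselbalch terms for each group."""
    positive = sum(n / (1.0 + 10.0 ** (ph - PKA_POSITIVE[g]))
                   for g, n in counts.items() if g in PKA_POSITIVE)
    negative = sum(n / (1.0 + 10.0 ** (PKA_NEGATIVE[g] - ph))
                   for g, n in counts.items() if g in PKA_NEGATIVE)
    return positive - negative


def computed_pi(counts: dict, tol: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection (charge falls as pH rises)."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def charge_train(counts: dict, steps: int) -> list:
    """Computed pI for the unmodified protein and for 1..steps blocked lysines."""
    if steps > counts.get("K", 0):
        raise ValueError("cannot block more lysines than the protein contains")
    train = []
    for blocked in range(steps + 1):
        modified = dict(counts)
        modified["K"] = counts["K"] - blocked  # a blocked lysine no longer carries charge
        train.append(computed_pi(modified))
    return train


if __name__ == "__main__":
    # Hypothetical composition; not the rabbit muscle CPK sequence.
    demo = {"K": 30, "R": 18, "H": 15, "D": 28, "E": 27, "C": 4, "Y": 9,
            "Nterm": 1, "Cterm": 1}
    for blocked, pi in enumerate(charge_train(demo, 5)):
        print(blocked, round(pi, 2))
```

Each additional blocked lysine removes one positive group, so the computed pI moves toward the acidic side at every step, which is the behaviour the charge train relies on.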

In order to standardize SDS molecular weight (SDS-MW), we have used a standard curve fitted to a 
series of identified proteins (Fig. 8). Rather than using molecular mass per se, we have elected 
to use the number of amino acids in the polypeptide chain, as perhaps a better indication of the 
length of the SDS-coated rod that is sieved by the second dimension slab. The resulting values 
were multiplied by [...] (the weighted average mass of amino acids in sequenced proteins) to give 
predicted molecular masses. Because we use gradient slabs, we have not constrained the fitted 
curve to conform to any predetermined model; rather we tried many equations and selected the best 
using the program "Tablecurve" on a PC. The equation chosen was y = a + bx + c/x^2, where y is the 
number of residues, x is the gel Y-coordinate, a is 511.83, b is -0.2731 and c is 33183801. The 
resulting fit appears to be fairly good over a broad range of molecular mass. 
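
The standard curve can be evaluated directly from the three fitted constants given in the text. The sketch below takes the fitted form to be y = a + bx + c/x^2, which is an assumption wherever the printed equation is unclear in the scan. The average residue mass is not legible in the source, so it is left as a required argument, and the value used in the example is a placeholder for illustration only.

```python
A = 511.83        # constant term, from the text
B = -0.2731       # linear coefficient, from the text
C = 33183801.0    # inverse-square coefficient, from the text


def residues_from_y(y_coord: float) -> float:
    """Estimated number of residues for a spot at a given gel Y-coordinate.

    Assumed form of the fitted curve: residues = A + B*y + C / y**2.
    """
    if y_coord <= 0:
        raise ValueError("y_coord must be positive")
    return A + B * y_coord + C / y_coord ** 2


def sds_mw(y_coord: float, mass_per_residue: float) -> float:
    """Predicted molecular mass: residue count times an average residue mass."""
    return residues_from_y(y_coord) * mass_per_residue


if __name__ == "__main__":
    PLACEHOLDER_RESIDUE_MASS = 112.5  # hypothetical value, not taken from the source
    for y in (250.0, 500.0, 1000.0, 1500.0):
        print(y, round(residues_from_y(y), 1), round(sds_mw(y, PLACEHOLDER_RESIDUE_MASS)))
```

Spots near the top of the slab (small Y) map to long chains and spots near the bottom to short ones, as expected for a gradient gel.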

3.3 An example of rat liver gene regulation: Cholesterol metabolism 

Experiment LSBC04 was designed as a small-scale test of the regulation of cholesterol metabolism 
in vivo by three agents included in the diet: lovastatin (Mevacor®, an inhibitor of HMG-CoA 
reductase); cholestyramine (a bile acid sequestrant that has the effect of removing cholesterol 
from the gut-liver recirculation); and cholesterol itself. The first two agents should lower 
available cholesterol and the third should raise it, allowing manipulation of relevant gene 
expression control systems in both directions. Such an experiment offers an interesting test of 
the 2-D mapping system since most of the pathway enzymes are present in low abundance, many are 
membrane-bound and difficult to solubilize, and the pathway itself is complex. Approximately 1000 
proteins were separated and detected in liver homogenates. Twenty-one proteins were found to be 
affected by at least one treatment, and these could be divided into several coregulated groups. 

3.3.1 MSN 413 (putative cytosolic HMG-CoA synthase) and sets of spots regulated coordinately or 
inversely 

One group of spots (including a spot assigned to the cytosolic HMG-CoA synthase, MSN 413) showed 
the expected increase in abundance with lovastatin or cholestyramine, the synergistic further 
increase with lovastatin and cholestyramine, and a dramatic decrease with the high cholesterol 
diet. Spot number 413 is the most strongly regulated protein in the present experiment, showing a 
5- to 10-fold induction after a 1 week treatment with 0.075% lovastatin and 1% cholestyramine in 
the diet (Figs. 9 and 10). Its expression follows precisely the expectation for an enzyme whose 
abundance is controlled by the cholesterol level; it is progressively increased from the control 
levels by cholestyramine, lovastatin and lovastatin plus cholestyramine, and it sinks below the 
threshold of detection in animals fed the high cholesterol diet. This spot has been tentatively 
identified as the cytosolic HMG-CoA synthase, based on a reaction with an antiserum to that 
protein provided by Dr. Michael Greenspan at Merck Sharp & Dohme Research Laboratories. This 
enzyme lies immediately before HMG-CoA reductase in the liver cholesterol biosynthesis pathway, 
and is known to be co-regulated with it. Spot 413 has an SDS molecular weight of about 54 000 and 
a CPK pI of -11.4, in reasonably close agreement with a molecular weight of 57 300 and a CPK pI of 
-15.7 computed from the known sequence of the hamster enzyme [43]. 

Using a classical product-moment correlation test (Kepler procedure CORREL), a series of five 
additional spots was found to be coregulated with 413. The level of correlation was exceedingly 
high (> 95%). Two of these, 1250 and 933, are at similar molecular weights and approximately one 
charge more acidic than 413 (Fig. 9), indicating that they may be covalently modified forms of the 
413 polypeptide. This suspicion is strengthened by the observation that both spots are also 
stained by the antibody to cytosolic HMG-CoA synthase. The remaining three correlated spots appear 

Figure 10. Bargraph showing the quantitative effects of various treatments on the abundance of 
MSN:413 (cytosolic HMG-CoA synthase) in the gels of Fig. 9. 

Figure 11. Bargraphs of a series of six coregulated spots including MSN:413. In the bargraphs, the 
abundances of the appropriate spot (master spot number shown at the top of the panel) in each 
animal are shown. The five five-animal groups are in the order (left to right): high cholesterol, 
controls, cholestyramine, lovastatin, and lovastatin plus cholestyramine. Each bar within a group 
represents one experimental animal liver (one 2-D gel). Note the correlated expression of the 6 
spots, especially in the two far right (most strongly induced) groups. 
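
Coregulation of two spots across animals can be quantified with the product-moment (Pearson) correlation coefficient. The sketch below is a plain Python version of that statistic; it is not the Kepler CORREL procedure, and the abundance values in the example are invented for illustration.

```python
from math import sqrt
from typing import Sequence


def pearson_r(a: Sequence[float], b: Sequence[float]) -> float:
    """Product-moment correlation between two equal-length series of abundances."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two series of equal length with at least two values")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_a * var_b)


if __name__ == "__main__":
    # Hypothetical abundances of two spots, one value per animal (one 2-D gel each).
    spot_a = [0.1, 0.2, 1.0, 1.1, 2.4, 2.6, 5.0, 5.3, 9.8, 10.4]
    spot_b = [0.0, 0.1, 0.5, 0.6, 1.1, 1.4, 2.6, 2.5, 5.1, 5.0]
    print(round(pearson_r(spot_a, spot_b), 3))
```

A coefficient close to +1 indicates coordinate regulation and one close to -1 indicates inverse regulation.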

[Table: listing of rat liver protein spots by master spot number (MSN), with gel coordinates, 
isoelectric point relative to CPK standards (CPK pI), and SDS molecular weight (SDS-MW). The 
column values are not recoverable from the scan.] 
1214 
1215 
1216 
1217 
1218 
1219 
1220 
1221 
1222 
1223 
1224 
1225 
1226 
1227 
1228 
1229 
1230 
1231 
1232 
1233 
1234 
1235 
1236 
1237 
1236 
1239 
1240 
1241 
1242 
1243 
1244 
1245 



921 
1564 
637 
823 
665 
564 
562 
538 
545 
1089 
1304 
1386 
1806 
1485 
1459 
1431 
1407 
1383 
1454 
1422 
1394 
1171 
1457 
686 
265 
400 
344 
506 
572 
639 
637 
614 
637 
1095 
1719 
791 



1156 
864 
400 
397 
397 
S28 
529 
524 

514 

522 
586 
539 
702 
224 
224 



313 
306 
320 
326 
394 
402 
386 
641 
660 
914 
673 
970 
1021 
1392 
1354 
1362 
673 
614 
603 
666 
707 
475 
466 
756 
1324 
1583 
1865 
1812 
1411 
1392 
794 
769 
740 
743 
713 
682 
663 
585 



224 
162 
183 
182 
214 
286 
1114 
803 
1292 
1275 
1311 
1293 
1502 
1402 
1407 
1431 
1394 
1545 



1021 
195 
194 
197 
197 
294 
294 
294 
329 
329 
266 
245 
372 
296 
205 
203 
205 
540 
542 
539 
623 
628 
447 

1282 

1461 

1170 

1005 

809 

617 

703 

682 

410 

407 

406 

511 

510 

509 

504 

562 



-13.7 
-3.5 
-21 J 
-21 J 
-20.2 
-24.4 
-25.0 
•25.8 
-255 
-10.2 
-75 
•6.6 
•35 
-45 
.-5-2 
-5.7 
•6.1 
•64 
-55 
-55 
-€5 
-8.2 
-55 
-195 
<-35.0 
-325 
<-35.0 
-275 
-24.1 
-21.2 
-21 J 
•22.1 
-215 
•105 
-2.1 
-16.5 
-12.9 
<-35.0 
<-35.0 
<-35.0 
<35.0 
-33.2 
•32.7 
-33.7 
-21.2 
-20.4 
•135 
•14.7 
•12.7 
-11.6 
-65 
•65 
-6.7 
-19.9 
-22.1 
-225 
-19.2 
-18.9 
-28.7 
•29.0 
-17.4 
-7.2 
•3.6 
-0.6 
-1.0 
•6.0 
-6.3 
-164 
•17.1 
-17.9 
-175 
-16.7 
•19.6 
-205 
-24.4 



24.700 
35.900 
68.400 
68.800 
86.700 
54.500 
54.S00 
54.800 
55.700 
55.000 
50.200 
53.700 
43.400 
124.900 
124.900 
125.100 
125.200 
124.700 
164.400 
162600 
164500 
131.800 
94.200 
26400 
34.700 
20.000 
20.600 
19.400 
20.000 
13.000 
16500 
16.200 
15.400 
16.600 
11.600 
45.200 
29.700 
148.700 
149.800 
147.400 
146.600 
91.400 
91.200 
91.400 
61.600 
61.600 
101.800 
112.000 
72900 
90.100 
139.500 
141.800 
139.500 
53.600 
53 400 
53.600 
47,800 
47.500 
62.300 
20.400 
14.400 
24.200 
30.300 
36.200 
37.900 
43.400 
44.500 
66.900 
67.300 
67.500 
55.600 
56.000 
56.100 
56.500 
50.500 



1246 

1247 

1246 

1250 

1251 

1252 

12S3 

1254 

1255 

1257 

1258 

1258 

1260 

1261 

1262 

1283 

1284 

1265 

1266 

1267 

1268 

1269 

1270 

1271 

1272 

1273 

1274 

1277 

1278 

1279 

1280 

1281 

1282 

1283 

1284 

1285 

1286 

1267 

1288 

1289 

1290 

1291 
1292 
1293 
1294 
1295 



547 
530 
516 
973 
807 
665 
809 
1311 
1300 
1938 
1806 
1727 
1629 
1555 
1468 
1413 
1340 
1263 
1182 
1110 
10S5 
909 
050 
905 
857 
610 
774 
737 
702 
671 
645 
617 
595 
573 
S52 
536 
515 
496 
487 
447 
427 
412 
397 
381 
365 
348 



$77 
571 

572 
536 

532 
529 
786 
746 

761 

712 

719 

715 

713 

717 

717 

722 

717 

717 

720 

717 

717 

717 

715 

712 

714 

70S 

711 

708 

711 
710 
710 
707 
704 
700 
695 
894 
867 
683 
869 
667 
655 
655 
652 
654 
653 

653 < 



-255 
•265 

-27.0 
•127 
•22 4 
-205 
-14.1 
-74 
-7.5 
0.0 
-1.0 
-2.0 
-30 
-4.0 
•5.0 
-6.0 
•7.0 
-8.0 
-0.0 
-10.0 
-11.0 
-12.0 
-13.0 
-14.0 
-15.0 
-16.0 
-17.0 
-16.0 
-19.0 
•20.0 
-21.0 
•22.0 
-23.0 
-24.0 
•25.0 
-26.0 
-275 
-28.0 
-29.0 
-30.9 
-315 
•32.0 
■33.0 
•34.0 
35.0 
-35.0 



50.800 

50.000 

51 .200 

53.900 

54.200 

54.400 

40.200 

41.200 

40.400 

42.900 

42.600 

42.700 

42.800 

42.800 

42.600 

42.400 

42.600 

42.600 

42.500 

42.600 

42.600 

42.600 

42,700 

42.900 

42800 

43.300 

42.900 

43.100 

42900 

43.000 

43.000 

43.100 

43.300 

43.500 

43.700 

43.800 

44.200 

44.400 

45.200 

45.300 

45.000 

45.900 

46.100 

46.00O 

46.100 

46.100 
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hemoglobin (Hb) ™»*>W prottm tuadirtx: fabbit 0lttcle C?K M(J buffua 



efm l«cr,roi«,ii, 



929 




7 
7 
7 
7 
7 

7 

7 

7 

7 

7 

7 

7 

7 



8 
6 
8 
8 
8 
8 
8 
8 
6 
8 
8 
8 
8 



9 


11 


3 


9 


10 


3 


9 


9 


3 


9 


8 


3 


9 


7 


3 


9 


6 


3 


9 


5 


3 


9 


4 


3 


9 


3 


3 


9 


2 


3 


9 


1 


3 


9 


0 


3 


9 


0 


3 



1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 

0 



7.18 
6.79 
6.S3 
632 
6.13 
5.96 
5.78 
5.59 
5.37 
5.14 
4.91 
4.71 
4.54 



-1.8 
•3.2 
-5.3 
-7.2 
-10.0 
-123 
-15.5 
-16.0 
-21.0 
•25.5 
•27.2 
Table 4. Computed pI's of some known proteins related to measured CPK pI's

No. Protein name                                        PIR name  #ASP #GLU #HIS #LYS #ARG [...]
    Assumed pK                                                    3.9  4.1  6.0  10.8 12.5 [...]

0  Creatine phosphokinase (CPK), rabbit muscle           KIRBCM  28 27 17
1  Fatty acid-binding protein, rat hepatic               F2RTL   5 13 2
2  b2-microglobulin, human                               MGHUB2  7 8 4
3  Carbamoyl-phosphate synthase, rat                     SYRTCA  72 96 28
4  Proalbumin (serum albumin precursor), rat             ABRTS   32 57 15
5  Serum albumin, rat                                    ABRTS   32 57 15
6  Superoxide dismutase (Cu-Zn, SOD), rat                A26810  8 11 10
7  Phospholipase C, phosphoinositide-specific (?), rat   A28807  34 42 9
8  Albumin, human                                        ABHUS   36 61 16
9  Apo A-I lipoprotein, rat                              A24700  16 24 6
10 proApo A-I lipoprotein, human                         LPHUA1  16 30 6
11 NADPH cytochrome P-450 reductase, rat                 RDRT04  41 60 21
12 Retinol binding protein, human                        VAHU    18 10 2
13 Actin beta, rat                                       ATRTC   23 26 9
14 Actin gamma, rat                                      ATRTC   20 29 9
15 Apo A-I lipoprotein, human                            LPHUA1  16 30 5
16 Apo A-IV lipoprotein, human                           LPHUA4  20 49 8
17 Tubulin alpha, rat                                    UBRTA   27 37 13
18 F1 ATPase beta, bovine                                PWBOB   25 36 9
19 Tubulin beta, pig                                     UBPGB   26 36 10
20 Protein disulphide isomerase (PDI), rat hepatic       ISRTSS  43 51 11
21 Cytochrome b5, rat                                    CBRT5   10 15 6
22 Apo C-II lipoprotein, human                           LPHUC2  4 7 0



34 

16 
6 
95 
53 
53 
9 
49 
60 
23 
21 
38 
10 
19 
19 
21 
26 
19 
22 
15 
51 
10 
6 



16 
2 
5 

56 

27 
24 
4 
21 
24 
12 
17 
36 



18 
16 
24 
21 



6.84 
7.83 
6.09 
5.97 
5.9B 
5.71 
5.91 
5.92 
5.70 
532 
535 
5.07 



14 5.04 
18 5.06 



5.07 
5.10 
4.68 
4.66 



22 4.80 
22 4.49 
9 4.07 



4.59 
4.44 



Ami 

04 

•34 

•5.0 
-5$ 
•6J 
-9.0 
-9.2 
-9.2 
•114 
-137 
•lO 
-154 
-164 
•17.2 
•164 
-174 
•19.7 
•194 
•21X 
-224 
•254 
-264 
•304 



Amino acid pK's assumed in calculation:



3.9 4.1 6.0 10.8 12.5 
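
The computed pI values in Table 4 follow from residue counts and the assumed pK values listed above. A minimal sketch of such a calculation is given here, assuming only the five side-chain pK values (3.9, 4.1, 6.0, 10.8, 12.5) and ignoring chain termini, cysteine and tyrosine; the function names and the example composition are hypothetical and are not taken from the original software.

```python
# Side-chain pK values as listed with Table 4. Everything else in this
# sketch (function names, bisection, ignoring termini) is an assumption.
PK_ACID = {"ASP": 3.9, "GLU": 4.1}
PK_BASE = {"HIS": 6.0, "LYS": 10.8, "ARG": 12.5}


def net_charge(counts, ph):
    """Net charge of a protein at a given pH (Henderson-Hasselbalch)."""
    charge = 0.0
    for residue, pk in PK_BASE.items():
        # Fraction of basic groups that are protonated (charge +1).
        charge += counts.get(residue, 0) / (1.0 + 10.0 ** (ph - pk))
    for residue, pk in PK_ACID.items():
        # Fraction of acidic groups that are deprotonated (charge -1).
        charge -= counts.get(residue, 0) / (1.0 + 10.0 ** (pk - ph))
    return charge


def computed_pi(counts, lo=0.0, hi=14.0, tol=1e-4):
    """pH at which the net charge is zero, found by bisection.

    The net charge decreases monotonically with pH, so bisection is safe.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge(counts, mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical composition, for illustration only.
example = {"ASP": 10, "GLU": 12, "HIS": 3, "LYS": 9, "ARG": 5}
more_basic = dict(example, LYS=20)
example_pi = computed_pi(example)
more_basic_pi = computed_pi(more_basic)
```

Because termini and other ionizable groups are left out, values from this sketch should not be expected to match the table entries exactly.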







926 



Y CPW SOSMW 



MSN 
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761 

763 

764 

765 

766 

767 

766 

760 

770 

771 

773 

775 

776 

777 

778 

779 

780 

764 

765 

766 

767 

780 

781 

782 

783 

7*4 

786 

787 

786 

799 

600 

801 

602 



1399 
1416 
2020 
651 
1052 
1966 
1330 
197D 
657 
1337 
1576 
969 
1436 
1539 
650 
700 
1052 
1413 
1364 
1622 
603 
616 
451 
777 
1536 
1461 
368 
1126 
833 
1420 
1750 
624 



605 
606 
607 



1775 
573 
203 
980 
802 
625 
1651 
440 
1356 
651 
745 



733 
1065 
566 
475 
1149 

665 
613 
617 
974 
502 
624 
706 
456 
434 
411 
1136 
529 
665 
635 
392 
682 
1429 
377 
1S43 
807 
546 
212 
437 
563 
279 
665 
547 
1468 
196 



810 
811 
812 
613 

614 2026 

815 1066 

616 629 

817 1376 

618 1771 

619 1045 
820 964 

621 1712 

622 1256 

623 1517 

624 1442 

625 1240 

626 1309 

627 2012 
828 837 
630 1342 
831 562 

632 1073 

633 461 



634 
837 
636 



501 
751 
635 



639 1494 
1952 



641 
842 



1565 
571 



643 1325 

644 1727 

645 630 

646 2016 

647 673 1200 



1039 
306 
827 
1015 
S73 
249 
393 
1246 
610 
645 
313 
1177 
790 
263 
362 
279 
205 
6S4 
449 
513 
1014 
706 
1405 
756 
626 
1039 
620 
561 
746 
633 
459 
301 
1060 
1312 
649 
301 
679 



•63 
-59 
>0.0 
•203 

•11.1 
>O0 

-7.1 
>0.0 
-15.0 
•7.0 
•3.7 
•12.6 
•5.5 
-4.2 
-15.1 
-19.1 
-11.1 
-6.0 
-6.7 
-0.9 
-14.3 
-22-0 
•29.6 
•16.9 
•4.2 
-5.1 
•33.6 
-9.6 
•13.5 
•5.9 
-1.6 
-21.7 
-14.2 
•1 .4 
•24.0 
<-35.0 
-12.5 
-14.1 
-21.7 
-0.7 
-30.9 
-66 
-15.1 
-17.6 
>0.0 
-10 4 
-21.6 
-6.5 
-1.4 
•11.2 
-12.4 
-2.2 
•6.1 
-4 4 
-5.5 
-6.3 
-7.4 
>0.0 
-13 4 
-7.0 
-24.5 
-10.7 
-28.6 
-27.8 
•17.6 
-21.3 
-4.7 
>0.0 
•3.6 
'24.1 
•7.2 
-2.0 
•213 
>0.0 
-16.8 



41J00 

27,300 
51.400 
50.300 
25.000 
59.900 
44.300 
48.500 
48,200 
31.500 
56.700 
37.600 
43.100 
61J0OO 



66.800 
25300 
54.400 
35.000 
37.100 
68300 
35.100 
15.400 
72.000 
11.700 
36300 
53.100 
133.700 
63.400 
49.800 
96.500 
35.600 
53.000 
14.200 
146.400 
57.400 
29.000 
87.200 
37300 
29.800 
51.100 
109.700 
69.400 
21.600 
38.200 
46.500 
65.700 
24.000 
39.100 
103.100 
74.600 
96.700 
139.200 
46.000 
62.000 
55.800 
29.900 
43.100 
16.200 
40.700 
37300 
29.000 
37300 
50.500 
41.100 
37.200 
60.900 
69300 
27.500 
19.400 
46.300 
89300 
44.600 
34.200 
23.200 



650 
651 
652 
855 
656 
857 
658 
659 
860 
861 



865 



670 
871 
672 
873 
674 
875 
876 
677 
878 
878 
880 
661 



1863 
1166 
1535 
1035 
634 



1063 
867 
1446 
706 
1070 
472 
674 
1307 
645 
827 
665 
1607 
1323 
1226 
1804 



271 
523 
1024 
626 
542 
220 
184 



687 
666 



1540 
1566 
1186 
1076 
1161 

647 
1756 
1543 
1432 

822 
1103 
1501 

786 



890 

891 

692 

884 

895 

896 

887 

886 

899 

900 

901 

903 

904 

905 

907 

908 

910 

911 

913 

914 

916 

917 

919 

820 

821 

923 

824 

82S 

926 

827 

828 

829 

831 

632 

833 

834 

836 

837 



851 
717 
1123 
881 
1245 
1962 
1322 
420 
662 
645 
624 
831 
799 
765 
775 
866 
828 
661 
1544 
1606 
1237 
1442 
1260 
764 
1133 
1123 
829 
1131 
1441 
679 
1467 
1062 
1231 
1606 
610 
065 
947 
665 
1421 



639 
311 
1066 
347 
480 
499 
867 
1004 
464 
402 
763 
1031 
346 
647 
756 
777 
351 
720 
1111 
757 
564 
278 
690 
669 
414 
607 
1103 
634 
759 
546 
229 
413 
234 
346 
626 
570 
426 
243 
703 
1094 
229 
520 
669 
824 
1303 
1544 
301 
387 
666 
748 
367 
1541 
1123 
380 
242 
316 
674 
216 
1191 
775 
616 
670 
800 
520 
462 
643 
1056 



•0.6 
•63 
•43 
-11.4 
-153 
•273 
-10.6 
-14 4 
-54 
-16.6 
•10.7 
-283 
•163 
-7.4 
•213 
-153 
•163 
-1.0 
•73 
•6.4 
-03 
-243 
•43 
-33 
•63 
-10.6 
•83 
-203 
•1.6 
-4.1 
•5.7 
-13.7 
-10.1 
-46 
-163 
-213 
-13.1 
-18.6 
•83 
-143 
•83 
>0.0 
-73 
-31.4 
-203 
•153 
-21.7 
-13.5 
•163 
-173 
-17.0 
•14.4 
•15.6 
-19.7 
-4.1 
•33 
-63 
•S3 
•6.0 
•173 
-6.7 
-93 
•15.6 
-97 
•5.5 
■19.7 
-46 
•10.5 
•6.4 
•33 
16.0 
123 
133 
143 
•S3 



69.500 
54.900 
29.600 
37.500 
53.400 
127.100 
150.500 
34.800 
46.800 
66300 
28.000 
77.600 
56.600 
57300 
34.800 
30300 
57.400 
66.000 
39.400 
29300 
77.700 
46.400 
40.700 
39.700 
76.600 
42.500 
26.400 
40.700 
49,700 
97.100 
34.600 
44.100 
■ 66.400 
46.800 
26.600 
47300 
40.600 
52.900 
121300 
66.400 
117.800 
77.700 
47.700 
51300 
64.500 
113.000 
43.400 
27.000 
121.000 
55.200 
34.600 
37.600 
18.700 
11.700 
89.100 
70.400 
44.100 
41.100 
73.700 
11,700 
25.600 
71.500 
113300 
64300 
35.400 
128.200 
23.500 
39.600 
38.000 
45.100 
34.400 
55.100 
60.600 
36.600 
28.400 



sow 



841 
942 
943 
844 
845 
846 
847 

»a 

849 
950 
851 
852 
854 
855 
857 
858 
860 
861 
962 
963 
964 
665 
966 
967 



970 

971 

972 

974 

975 

976 

977 

976 

979 

960 

661 

963 

964 

965 

967 

968 

990 

991 

992 

993 

994 

995 

996 

997 

996 

999 

1000 

1001 

1002 

1003 

1006 

1007 

1009 

1010 

1011 

1012 

1013 

1014 

1015 

1016 

1017 

1016 

1020 

1021 

1022 

1023 

1024 

1025 



1187 
1765 
602 
312 
883 
1300 
630 
167 
1360 
1766 
1038 
660 
857 
503 
1836 
1010 
766 
586 
557 
667 
564 
869 
671 
1204 
910 
609 
1285 
822 
976 
403 
279 
644 
1124 
994 
1612 
749 
1064 
1197 
1762 
1344 
1024 
739 
616 
785 
1159 
1090 
1030 
847 
902 
888 
1615 
1205 
617 
966 
970 
1736 
643 
822 
875 
291 
1386 
459 
679 
1818 
1032 
1629 
1311 
1722 
1015 
1574 
781 
1129 
612 
785 
1290 



637 
865 

472 
466 
491 
269 
423 
736 

344 

665 
193 
1S2 
701 
547 
712 
616 
174 
419 
409 
320 
334 

1156 
255 
786 
154 

1048 
206 
232 
437 
567 
495 
861 



642 
1141 
642 
811 
1506 
317 
1105 
1159 
555 
361 
317 
928 
701 
811 
461 
647 
579 
504 
289 
290 
771 
478 
1164 
487 
279 



745 
541 
661 
1128 
634 
994 
1134 
424 
743 
1219 
464 
63 
317 
446 
739 



•6.8 
•13 
•22.7 
O5.0 
•12.1 
•7.5 
-216 
<-35C 
•65 
•13 
•11.3 
•143 
•13.0 
-27.6 
>0.0 
-113 
-173 
-230 
•248 
-14 4 
-24.5 
•123 
•20.0 
-6.7 
-133 
-223 
-7.7 
•158 
-123 
-32.6 
<-35.0 
•15.3 
-93 
-12.1 
-33 
-17.7 
-103 
•63 
-1.6 
-6.8 
-113 
-17.8 
•1*3 
-16.7 
•9.3 
•10.4 
-11.5 
•153 
•14.1 
•14.4 
4)3 
•6.7 
-22.0 
-12.8 
-12.7 
-1.9 
•21.1 
-153 
-14.6 
-35.0 
-6.4 
•29.4 
-19.7 
-0.9 
-114 
-3.0 
-74 
-20 
-11.7 
-3.7 
163 
-9.7 
-153 
16.7 
-7.7 



35Jtt 

57.1* 
$7.7* 
100^ 
65.1* 
4 »JCC 

'S1.0QC 
213.00D 

53.0* 
42.9* 
37.9* 

174.9* 
65.7* 
57.1QC 
83.9* 
60.500 
24J00 

106.6ft 
38.700 

210J0C 
26.70Q 

138.900 

119J0C 
63.40C 

si ear 

57.4* 
3130C 
91.100 
45.400 
46.700 

46.700 
33.900 
12300 
84.700 
26.600 
24 600 
52.40C 
74.900 
64300 
33300 
43.400 
38300 
60.760 
36.600 
50.760 
56.500 
93.100 
92.700 
40.000 
58.900 

23.700 

58.100 

96.400 

46.600 

4130 

S3.S00 

45.600 

25300 

47.200 

30.700 

21500 

65.000 

41.300 

21S00 

56.40° 

581.* 
84.600 

41.68° 
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MSN 



Y CPK* SOSMW 



259 1796 

260 661 

261 1725 

262 406 

263 1063 

265 1390 

266 510 

267 660 
266 00 

268 1044 

270 2016 

271 657 

272 6B6 

274 12B2 

275 1350 

276 1670 

277 668 
276 961 
279 679 
261 1646 

282 1505 

283 1313 
264 1314 

285 1332 

286 1277 
268 1391 

289 1147 

290 925 

291 787 

292 1462 
283 531 



961 
1361 
879 
1127 
172 
673 
437 
1038 
961 



284 660 

295 1162 

296 216 

297 1377 

299 913 

300 2012 



853 
422 
968 
712 
560 

1089 
538 
716 
570 

1084 
525 

1147 



652 
824 
579 
511 
1476 
818 

44S 



301 
302 
303 



702 
494 
403 

304 1S43 

305 1049 

306 1606 

307 1«9 
306 1627 

309 1524 

310 1760 

311 1609 

312 2G6 

313 1902 

314 1316 

315 1341 

316 1104 
320 1480 



609 
614 
979 
1523 
667 
178 
1280 
1006 
1585 
5B3 
989 
916 
755 
682 
1028 
1451 
1408 
1365 
1395 
523 
1053 
1456 



321 


850 


603 


322 


1454 


1494 


323 


670 


626 


324 


6S5 


101 


325 


1521 


675 


326 


1567 


677 


327 


1368 


406 


326 


446 


1291 


330 


1608 


751 


331 


1566 


607 


332 


531 


471 


333 


7B4 


1156 


334 


1059 


407 


335 


1593 


303 


336 


1616 


598 


338 


1854 


1004 


339 


1265 


688 


340 


561 


585 


341 


1497 


1047 


343 


1351 


265 


344 


1613 


549 



•1.1 

•204 
•2.0 
-28.0 
-10.9 
44 
-274 
•204 
•31.0 
-11.2 
>0.0 
•15.0 
-14.2 
-7.6 
•6.9 
•2.6 
•19.4 
-13.0 
-14.5 
•0.7 
-4.6 
-74 
-74 
•7.1 
-7.6 
44 
-9.5 
•13.6 
•16.6 
•5.1 
•26.3 
-14.9 
44 
<*35.0 
44 
-13.9 
>0.0 
-19.0 
-28.1 
•32.6 
•0.7 
-11.1 
-3.3 
•6.5 
•3.0 
-44 
-1.5 
44 
<-35.0 
-0.3 
-7.3 
-7.0 
-10.1 
-49 
-15.1 
-5.3 
-20.0 
•20.6 
-4.4 
•3.6 
•6.3 
-30.0 
44 
-3.6 
•26.3 
•16.7 
-10.9 
-3.5 
-3.2 
-06 
4.0 
-23.6 
-4.7 
•6.6 
•0.9 



31.900 

17.700 
44.800 
25.600 
177.400 
45.000 
63.400 
29.000 
31.900 
46.900 
36.300 
65.200 
31,700 
42.900 
49,900 
27.100 
53.700 
42.600 
51.300 
27.300 
54.800 
25.100 
37.400 
67.200 
46.100 
37.000 
50,700 
55.900 
13.900 
37.800 
62.000 
43.600 
46.700 
38.000 
31,300 
12.400 
45.300 
169.200 
20.400 
30.100 
10.300 
49.800 
30.900 
33.700 
40.700 
34.700 
29.400 
14.700 
16.100 
17.600 
16.600 
54.900 
28.500 
14.400 
49.100 
13,300 
47.700 
420,500 
44,800 
44,700 
67.000 
20.100 
40.900 
43.700 
59.600 
24.700 
67.300 
86.500 
49.400 
30.300 
34.900 
50.300 
28.700 
102.200 
52.800 



345 1QQ8 

346 1005 

347 62S 
3a 361 

349 110 

350 621 

351 912 

352 1574 

353 661 

354 706 

355 1450 

358 1374 
357 474 

356 798 

359 764 

360 1364 

361 1713 

362 1181 

363 914 

364 412 

365 741 

366 878 

367 1560 

368 963 

369 434 

370 639 

371 1587 

372 187S 

373 1351 

374 1506 

375 1823 

376 254 

377 1409 

378 621 

379 1017 
361 653 

382 856 

383 12S2 
364 1609 

385 1042 

386 1490 

387 1554 
386 1163 

389 1374 

390 1456 

391 716 

392 1799 

393 1482 

394 1227 

395 1530 

396 1410 

397 912 

399 1465 

400 1473 

401 1029 

403 1516 

404 1495 

405 1525 

406 723 

409 650 

410 1501 

411 836 

412 350 

413 1033 

415 737 

416 1576 

417 646 

418 1695 

419 725 

420 1289 

421 1171 

422 569 

423 929 

424 739 

425 1490 



578 
640 
728 
963 
1343 
1130 
619 
830 
612 
762 
830 
1152 
607 
346 
338 
1066 
789 



1166 
435 
486 
1603 
635 
620 
441 
610 
660 
762 
1050 
715 
532 
417 
563 



596 
674 
256 
1518 
483 
563 
603 
404 
902 



732 
758 
1461 
577 
755 
256 
1063 
450 
1140 
754 
554 
1092 
252 
663 
478 
1057 
1120 
538 
425 
606 
496 
462 
770 
1041 
912 
162 
656 
625 
965 



-11.9 
-104 
•21.7 
-354 
«45.0 
•26.7 
•13.9 
•3.7 
•124 
-16 9 
44 
-64 
-28.7 
•16*3 
-174 
•64 
-2.1 
•6.3 
•13.6 
•324 
-174 
-144 
•3.9 
•12.4 
•31 JO 
-21.2 
-36 
•04 
44 
-4.6 
•0.9 
<«35.0 
-6.1 
-21 J 
•11.7 
-13.1 
•154 
4.1 
-24 
•114 
-4.7 
-44 
•8.6 
-64 
-54 
•18.5 
-1.1 
-44 
•84 
-44 
-6.0 
-13.9 
-50 
•44 
•11.5 
-44 
-4.7 
•44 
-18.4 
-204 
-4.6 
-13.4 
-354 
-11.4 
-18 0 
•3.7 
-21.0 
-24 
•18.3 
-7.7 
-9.1 
-224 
•13.6 
•174 
<7 



50,600 
46.800 
42.000 
31.100 
16400 
25.700 
48.100 
54400 
33.000 
40.400 
37400 
24.900 
30.600 
77400 
79.400 
27.900 
40.100 
36.100 
24.600 
63.700 
58400 
13.000 
33.000 
55400 
63.000 
48.700 
36.100 
40.400 
28.300 
42.700 
54400 
65.900 
50.400 
57400 
49.600 
49,400 
44.900 
105400 
12400 
57400 
50.400 
49.100 
67.700 
34.300 
31.700 
44.000 
41.900 
40.600 
14.400 
50.600 
40.800 
106.400 
28.100 
61.900 
25.300 
40400 
52.500 
27.100 
108.000 
45.500 
56.000 
28.300 
26.000 
53.700 
64.900 
48.900 
57.300 
56.600 
40.000 
28.900 
33.900 
163.700 
36400 
47.700 
31.800 




426 
427 
428 
429 
430 
431 
432 
434 
435 
436 
437 
436 
439 
440 
441 
443 



447 



450 
451 
452 
453 
454 
456 
457 
459 
460 
461 
462 
463 
464 
465 
466 
468 
469 
470 
471 
472 
473 
474 
475 
476 
477 
478 
479 
480 
462 
463 
465 
466 
487 



1 

610 

1565 
1259 
1253 
734 
463 
516 
1020 
1122 
1670 
435 
66 
1740 
599 
743 
601 
1050 
1245 
1576 
1816 
1094 
1945 
1652 
1400 
1394 
905 
1038 
1566 
1528 
1096 
649 
1614 
1368 
1194 
577 
1140 
1797 
1283 
618 
2006 
1205 
1035 
160 



JO* 
643 

303 
647 

562 
1426 
433 
1CM1 
1170 
196 
673 
1102 
647 
544 

1571 
335 
668 
926 

1298 

1516 

1021 



802 
884 

500 
716 
436 
581 



1137 
1125 
1072 

461 
1064 

467 



490 

491 

492 

493 

494 

495 

496 

497 

499 

500 

501 

502 

503 

504 

505 

506 

507 

506 

509 

510 



599 
1009 
1216 
816 
683 
1608 
478 
1025 
1045 
1609 
775 
692 
1100 
1760 
682 
470 



1414 
1234 
1246 
824 
1246 
1115 
1189 
1578 
787 
979 
1153 
1730 



524 
1133 
655 
299 
215 
786 
155 
1370 
662 
540 
235 
346 
673 
1013 
599 
607 
1186 
301 
1289 
178 
964 
776 
247 
1256 
1436 
852 
546 
1072 
659 
792 
1134 
1407 
391 
402 
250 
552 
619 
1006 



7* 
-164 
•34 
•6.0 
•6.1 
•16.1 
26 5 
•269 
11.6 
•9.8 
•04 
•31.0 
O5.0 
•14 
224 
•174 
-164 
-11.1 
4.2 
•3.7 
-04 
-104 
>0.0 
-24 
•6.1 
-64 
-14.0 
-114 
•34 
•4.3 
•10.2 
-154 
•0.9 
•64 
44 
-23.9 
4.6 
•1.1 
•74 
-214 
>0.0 
4.7 
-1V4 
<35.0 
-28.9 
-224 
•11.6 
4.6 
-154 
-194 
-34 
-28.6 
-114 
-114 
-3.3 
-17.0 
•194 
•104 
•1.6 
-14.5 
-28.9 
•28.1 
-124 
4.0 
43 
4.2 
-15.7 
4.2 
-9.9 
4.9 
3.7 
-16.6 
-12.5 
-9.4 
-20 



2* 

*4* 

"Jet 

*4Ct 
W7.(K 

28.** 
10.«x 

60.10: 

»4Ct 
19-6CC 
12J0C 
2B.OC 
63.1* 
384K 
34.CD 
S6J0C 
42.800 
6340E 
50 SOD 
91.40C 
15 «C 
254Z 
254K 
27400 
50.700 
27400 
60.100 
34.900 
54400 
25.900 
46.000 
89.900 
131400 
3940C 
207.600 
17.400 
45.800 
53400 
117.400 
77400 
44.900 
30.000 
49400 
46.» 
23.700 
69.200 
20.100 
169400 
31J0D 
39.700 
110.700 
21,9 

3*400 
5X160 
27J00 

41700 
39 CDC 

21*0 
16400 
6*706 

1086* 

stott 

4*101 
Figure 12. Data on a second coregulated group of spots,
presented as in Fig. 11. The fourth experimental group
(lovastatin) shows a modest induction, while the fifth
group (lovastatin plus cholestyramine) does not.




Figure [...]. Data on spot MSNJ67, presented as in Fig. 11. This
[...] unambiguously the anti-synergistic effect of lovastatin and
[...] (fifth group) vs. lovastatin (fourth group). This response
contrasts strongly with the regulation pattern seen in Fig. [...]
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13* 

313 
atT 

11»4 

12S3 
742 
7K 
121£ 
1145 

ion 
as 

712 



it iics 

182* 

123 
13T 



. 1113 

. 1» 
. 735 
2091 
722 
C71 
1CB2 

ior 

1171 
1400 

*« iasa 

g> 10BB 
735 



1253 
779 
1064 

* 1582 
2*. 1570 

1254 
£t«5 
7*1533 

V s mr 

. • 525 
?!■ 

It; Itll 

?C-t412 

?V w 
1582 
1506 
li 1»17 
!• *1« 
g*15BB 
"-1705 
651 
It "IS 

1705 

naMeW; 



Figure 8. Plot of number of amino acids versus gel Y coordinate
[...] curve used to predict molecular mass of [...]. (Axis label:
Gel Y Coordinate.)



Figure 7. (a) Plot of computed isoelectric point versus [...]
(Axis label: CPK position.)



Figure 9. Montage showing effects in the region of MSN:413.
The montage shows a small window into one portion of the 2-D
pattern, one row of windows for each experimental group, and
one panel for each gel in the experiment. The left-most pattern
in each row is a group-specific copy of the master pattern,
followed by the patterns for the five individual rats in the
group. The highlighted protein spots (filled circles) are spot
413 (on the right of each panel; identified as cytosolic HMG-CoA
synthase) and two modified forms of it ([...] and 933). From the
top, the rows (experimental groups) are: high cholesterol,
controls, cholestyramine, lovastatin, and lovastatin plus
cholestyramine.
tightly coregulated enzymes. A second group of six spots
was selected based on a regulatory pattern close to the
inverse of that for spot 413 (MSN's 34, 79, 178, [...];
data not shown). For these proteins, the lowest level of
expression occurs with exposure to lovastatin plus
cholestyramine and the highest level upon exposure to the
high [...]. [...] at the [...] molecular weight [...]
may thus be isoforms of a single protein. The other [...]
spots probably represent additional [...].

3.3.2 MSN 235 and coregulated spots

[...] group of five spots, mainly composed of mitochondrial
proteins including putative mitochondrial HMG-CoA synthase
spots, showed a modest induction [...] lovastatin [...]
treatments (including the combination of lovastatin and
cholestyramine; Fig. 12). This result is intriguing because
[...] cholesterol synthesis, which is entirely
extra-mitochondrial. Three of the spots (235, 134, 144) form
a closely packed triad at approximately 30 kDa, and are likely
to represent isoforms of one protein. All three spots are
stained by an antibody to the mitochondrial form of HMG-CoA
synthase [...] a mitochondrial location. The other two spots
(633 at about 38 kDa and 734 at about 69 kDa) are each present
at lower abundance than the [...]



proteins of the putative mitochondrial path[...]

[...] examination of all the coregulated p[...] quantitative
statistical techniques can extract [...] interesting
information from large sets of reproducible [...]. The
abundance of spots in the 413 coregulation group, [...]
shows an amazing level of concordance in the relative
expression among the five individuals of the lovastatin plus
cholestyramine treatment group. This effect is [...]
differences in total protein [...] been removed by scaling,
and since proteins with different regulation patterns can be
[...]. [...] coregulation sets may be revealed through the
[...] large population of control [...] any experimental
manipulation). This approach [...] natural (biological)
variation in protein expression [...] of a large library of
control animal patterns [...]

4 Conclusions



Because of the widespread use of rat liver in [...]
comprehensive database of liver proteins [...]. The master
pattern presented here has proven to be [...] presentation of
this system has been [...]. [...] we expect this database to
contribute valuable insights into gene regulation. Its [...]

Received September 11, 1991



3.3.3 An example of an anti-synergistic effect

[...] (two- to threefold), and about half as much induction
[...] cholestyramine, but with [...] animal-animal
heterogeneity pattern of the 235-set (Fig. [...]) [...] also
mitochondrial, and represents [...]

3.3.4 Complexity of the cholesterol synthesis pathway 

[...] cholestyramine, on the other hand, either alone or in
combination with lovastatin, produces a strong [...]
cytosolic pathway but little or [...] pathway. An explanation
for this difference may lie in lovastatin's effect on levels
of HMG-[...] related precursor compounds [...] the
mitochondrion, whereas cholestyramine should [...] only the
cytosolic pathway [...] cholesterol and [...] levels. It
remains to be explained why some
some of the principles of the earlier TYCHO system [34-...].
Procedure PROC008 is used to yield a spot list giving
position, shape and density information for each detected
spot. This procedure makes use of digital filtering,
mathematical morphology techniques and digital masking to
remove the background, and uses full 2-D least-squares
optimization to refine the parameters of a 2-D Gaussian shape
for each spot. Processing parameters and file locations are
stored in a relational database, while various log files
detailing operation of the automatic analysis software are
archived with the reduced data. The computed resolution and
level of Gaussian convergence of each gel are inspected and
archived for quality control purposes.
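
The least-squares refinement of a 2-D Gaussian spot model can be illustrated in reduced form. The sketch below is not PROC008, whose internals are not described here; it assumes the spot centre and widths are already known and fits only the peak amplitude, for which the least-squares solution has a closed form. All function names are hypothetical.

```python
import math


def gaussian_2d(x, y, amp, x0, y0, sx, sy):
    """Value at (x, y) of a 2-D Gaussian with peak height amp."""
    dx = (x - x0) / sx
    dy = (y - y0) / sy
    return amp * math.exp(-0.5 * (dx * dx + dy * dy))


def refine_amplitude(image, x0, y0, sx, sy):
    """Least-squares amplitude for a fixed centre and fixed widths.

    Minimizing sum((d - amp * g)^2) over amp gives
    amp = sum(g * d) / sum(g * g), where g is the unit-height Gaussian.
    """
    num = 0.0
    den = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            g = gaussian_2d(x, y, 1.0, x0, y0, sx, sy)
            num += g * value
            den += g * g
    return num / den


def residual_sum_of_squares(image, amp, x0, y0, sx, sy):
    """Sum of squared differences between the image and the model."""
    total = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            diff = value - gaussian_2d(x, y, amp, x0, y0, sx, sy)
            total += diff * diff
    return total


# Synthetic, noise-free spot used only to exercise the functions.
spot = [[gaussian_2d(x, y, 7.5, 4.0, 3.0, 1.5, 2.0) for x in range(9)]
        for y in range(7)]
fitted_amp = refine_amplitude(spot, 4.0, 3.0, 1.5, 2.0)
```

A full refinement of position and width as well as amplitude is nonlinear and would need an iterative optimizer; the residual function above is the quantity such an optimizer would minimize.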

Experiment packages are constructed using the Kepler
experiment definition database to assemble groups of 2-D
patterns corresponding to the experimental groups (e.g.,
treated and control animals). Each 2-D pattern is matched
to the appropriate "master" 2-D pattern (pattern F344MST3 in
the case of Fischer 344 rat liver), thereby providing linkage
to the existing rodent protein 2-D databases. The software
allows experiments containing hundreds of gels to be
constructed and analyzed as a unit, with up to 100 gels
displayed on the screen at one time for comparative purposes
and multiple pages to accommodate experiments of > 1000 gels.
For each treatment, proteins showing significant quantitative
differences vs. appropriate controls are selected using
group-wise statistical parameters (e.g., Student's t-test,
Kepler procedure STUDENT). Proteins satisfying various
quantitative criteria (such as P < 0.001 difference from
appropriate controls) are represented as highlighted spots
onscreen or on computer-plotted protein maps and stored as
spot populations (i.e., logical vectors) in a liver protein
database. Quantitative data (spot parameters, statistical or
other computed values) are stored as real-valued vectors in
the database. Analysis of coregulation is performed using a
Pearson product-moment correlation (Kepler procedure CORREL)
to determine whether groups of proteins are coordinately
regulated by any of the treatments. Such groups can be
presented graphically on a protein map, and reported together
with the statistical criteria used to assess the level of
coregulation. Multivariate statistical analysis (e.g.,
principal components analysis) is performed on data exported
to SAS (SAS Institute).
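
The two group-wise statistics named above can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the Kepler STUDENT or CORREL procedures: the t statistic uses the pooled-variance (equal variance) two-sample form, no P value is computed, and the abundance values are invented for the example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def student_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Converting t to a P value needs
    the t distribution, which is left out of this sketch.
    """
    na, nb = len(group_a), len(group_b)
    ma, mb = mean(group_a), mean(group_b)
    ss_a = sum((v - ma) ** 2 for v in group_a)
    ss_b = sum((v - mb) ** 2 for v in group_b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (ma - mb) / std_err, df


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length vectors."""
    if len(xs) != len(ys):
        raise ValueError("vectors must have the same length")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical abundances of one spot in treated and control gels.
treated = [10.0, 11.0, 12.0]
control = [1.0, 2.0, 3.0]
t_value, dof = student_t(treated, control)

# Hypothetical abundances of two spots across the same four gels.
spot_a = [1.0, 2.0, 3.0, 4.0]
spot_b = [2.0, 4.0, 6.0, 8.0]
r_value = pearson_r(spot_a, spot_b)
```

Spots whose abundance vectors correlate strongly across all gels would be candidates for a coregulated group; the threshold to apply is a separate choice.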



2.6 Graphical data output 

Graphical results are prepared in GKS and translated
within Kepler into output for any of a variety of devices.
Line drawing output is typically prepared as PostScript and
printed on an Apple LaserWriter. Detailed maps presented
here have been generated using an ultra-high-resolution
PostScript-compatible Linotronic output device. Greyscale
graphics are reproduced from the workstation screen using
a Seikosha videoprinter. Patterns are shown in the standard
orientation, with high molecular mass at the top and acidic
proteins to the left.

2.7 Experiment LSBC04

In the study described here, 12-week-old Charles River
male F344 rats were used. Diets were prepared at LSB [...]
Purina 5755M [...] Purified Diet. Lovastatin and
cholestyramine were obtained as prescription
pharmaceuticals, ground and mixed with the diet at
concentrations of 0.075% and [...], respectively. The high
cholesterol diet was Purina 5801M.A (5% cholesterol plus
[...] in the control diet). Animal work was carried out at
Microbiological Associates (Bethesda, MD). Animals were
acclimatized for one week on the control diet, fed the [...]
diets for one week, and sacrificed on day [...]. [...] daily
doses of lovastatin and cholestyramine in the [...] groups
were 37 mg/kg/day and 5 g/kg/day, respectively, based on the
weight of the food consumed. Liver samples were collected and
prepared for 2-D electrophoresis according to the standard
liver protocol (homogenization in [...] volumes of 9 M urea,
2% NP-40, [...] LKB pH 9-11 [...], followed by centrifugation
for 30 min at 80000 x g). Kidney, brain and [...] samples
were frozen. Gels were run [...] and the data were analyzed
using [...]. Gels were scaled, to remove the effect of
differences in protein loading, by setting the summed
abundances of a large number of matched spots equal for each
gel.
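
The loading correction described above can be sketched as a simple rescaling. This is a minimal illustration, assuming each gel is represented as a mapping from master spot number to abundance and that the reference set is every spot matched in all gels; the data layout, function name and choice of target are assumptions, not the Kepler implementation.

```python
def scale_gels(gels, target=None):
    """Rescale gels so the summed abundance of shared spots is equal.

    gels   -- list of dicts mapping spot number -> abundance
    target -- common sum to scale to (default: mean of the gel sums)
    """
    # Reference set: spots that were matched in every gel.
    common = set.intersection(*(set(gel) for gel in gels))
    if not common:
        raise ValueError("no spot is matched in every gel")
    sums = [sum(gel[spot] for spot in common) for gel in gels]
    if target is None:
        target = sum(sums) / len(sums)
    # Apply one multiplicative factor per gel to all of its spots.
    return [
        {spot: value * target / total for spot, value in gel.items()}
        for gel, total in zip(gels, sums)
    ]


# Hypothetical abundances; the second gel was loaded twice as heavily.
gels = [
    {413: 100.0, 235: 50.0, 7: 5.0},
    {413: 200.0, 235: 100.0},
]
scaled = scale_gels(gels)
```

After scaling, a difference in a spot's abundance between gels reflects a change relative to the reference set rather than a difference in the amount of protein loaded.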

3 Results and discussion 
3.1 Toe nt liver protein 2-D map 

F344MST3 is a standard 2-D pattern of rat livtr ««..-• 
based on the Fischer 344 straiS. Tto i« wL EE 
from a single 2-D gel and extensively ediS m an S 
ment comparing it to a range of protein loads ""JK 

high-abundance spots.More than 700 rat Iiver2-Dpauerns 

and protem characterization experiments, and numerous 
new spots (induced by specific drugs, for instance) have
been added as a result. A modified version including additional
spots present in the Sprague-Dawley outbred rat has
also been developed (data not shown). Figure 1 shows a
greyscale representation and Fig. 2 a schematic plot of the
master pattern. More than 1200 spots are included, most of
which are visible on typical gels loaded with 10 µL of solubilized
liver protein prepared by the standard method and
stained with colloidal Coomassie Blue. Master spot numbers
(MSNs) have been assigned to all proteins, and appear
in the following figures, each showing one quadrant of
the pattern. Figure 3 shows the upper left (acidic, high
molecular mass) quadrant, Fig. 4 the upper right (basic,
high molecular mass) quadrant, Fig. 5 the lower left (acidic,
low molecular mass) quadrant, and Fig. 6 the lower right
(basic, low molecular mass) quadrant. The quadrants overlap
to aid in moving between them. The gel position (in
100 micron units), isoelectric point (relative to the CPK internal
pI standards) and SDS molecular mass (from the calibration
curve in Fig. 8) are listed for each spot (Table 1). Because
of the precision of the CPK-pI values, these parameters
can be used to relate spot locations between gel systems
more reliably than using pI measurements expressed
as pH. A major objective of current studies is the identification
of all major spots corresponding to known liver proteins,
as well as rigorous definitions of subcellular organelle
contents. Of particular interest to us is the parallel development
of identifications in the rat and mouse liver
maps, allowing detailed comparisons of gene expression effects
in the two systems. The results of these studies will be
presented systematically in a later edition of this database.
culture and the associated shift to strong selection for growth,
and how do these affect experimental outcomes? Hence
the apparent advantages of in vitro systems, in terms of experimental
manipulation, may be counterbalanced by
other factors relating to 2-D data quality.

There is a second important class of reasons for exploring
the use of an in vivo biological system such as the liver. Historically,
there have been two broad approaches to the mechanistic
dissection of biochemical processes in intact cellular
systems: genetics (a search for informative mutants)
and the use of chemical agents (drugs and chemical toxins).
Both approaches help us to understand complex systems
by disrupting some specific functional element and showing
us the result. With the development of techniques for
genetic manipulation and cloning, the genetic approach
can be effectively applied either in vitro or in vivo, although
the in vitro route is usually quicker. The chemical approach
can also be applied to either sort of biological system; here,
however, the bulk of consistently acquired information is
in experimental animals (rats and mice). While most biologists
know a short list of compounds having specific, experimentally
useful effects (e.g., inhibitors of protein synthesis,
ionophores, polymerase inhibitors, channel blockers, nucleotide
analogs, and compounds affecting polymerization
of cytoskeletal proteins), there is a much larger number of
interesting chemically induced effects, most of them characterized
by toxicologists and pharmacologists in rodent
systems. Just as a thorough genetic analysis would involve
saturating a genome with mutations, it is possible to imagine
a saturating number of drugs, the analysis of whose actions
would reveal the complete biochemistry of the cell.
While organized drug discovery efforts usually target specific
desired effects, the nature of the process, with its dependence
on screening large numbers of compounds, necessarily
produces many unanticipated effects. It is therefore
reasonable to suppose that the broad range of
compounds necessary to achieve "biochemical saturation"
may be forthcoming; in fact, it may already exist among the
hundreds of thousands of compounds that failed to qualify
as drugs.

Among organs, the liver is an obvious choice for the study
of chemical effects because of its well-known plasticity and
responsiveness. The brain appears to be quite plastic (e.g.
[7]), but it is a complicated mixture of cell types requiring
skillful dissection for most experiments. The kidney, while
quite responsive, also presents a potentially confounding
mixture of cell types. The liver, by contrast, is made up of
one predominant cell type which is easy to solubilize: the
hepatocyte, representing more than 95% of its mass. Most
importantly, the liver performs many homeostatic functions
that require rapid modulation of gene expression. It
appears that most chemical agents tested affect gene expression
in the liver at some dosage (N. Leigh Anderson,
unpublished observations), an interesting contrast to our
earlier work with lymphocytes, for example, which seem to
be much less responsive. Such results conform to the expectation
that cells with a homeostatic, physiological role
should be more plastic than cells differentiated for a purpose
dependent on the action of a limited number of specific
genes.



has been made in the development of mouse, rat and human
hepatocyte culture systems, as well as in precision-cut
tissue slices. Using such an array of techniques, it is possible
to assemble a matrix of mammalian systems including
mouse and rat in vivo on one level and mouse, rat and
human in vitro on a second level, and to compare effects between
species and between systems. This approach allows
us to draw informed conclusions regarding the biochemical
"universality" of biological responses among the mammals
and to offer some insight into the validity of in vitro approaches
for toxicological screening. We believe this
will be necessary if in vitro alternatives are to achieve wide
usage in government-mandated safety testing of drugs, consumer
products and industrial and agricultural chemicals.

A number of interesting studies have been published using
2-D mapping to examine effects in the rodent liver. A number
of investigators have made use of the technique to
screen for existing genetic variants [8-11] or induced mutations
[12-14], mainly in the mouse. This work builds on the
wealth of genetic information available on the mouse and
its established position as a mammalian mutation-detection
system. While some studies of chemical effects have
been undertaken in the mouse [15-17], most have used the
rat [18-23]. The examination of the cytochrome P-450 system,
in particular, has been carried out almost exclusively
on the rat [24, 25].

These considerations lead us to conclude that rodent liver
offers the best opportunity to systematically examine an
array of gene regulation systems, and ultimately to build a
predictive model of large-scale mammalian gene control.
The basic underlying foundation of such a project is a reliable,
reproducible master 2-D pattern of liver, to which ongoing
experimental results can be referred. In this paper, we
report such a master pattern for the acidic and neutral proteins
of rat liver (pattern F344MST3). In future, this master
will be supplemented by maps of basic proteins, and analogous
maps of mouse and human liver.

2 Materials and methods 
2.1 Sample preparation 



The liver also allows the parallels between in vitro and in 
vivo systems to be examined in detail. Significant progress 



Liver is an ideal sample material for most biochemical studies,
including 2-D analysis. A sample is taken of approximately
0.5 g of tissue from the apical end of the left lobe of the
liver. Solubilization is effected as rapidly as practical: a
delay of 5-15 min appears to cause no major alteration in
liver protein composition if the liver pieces are kept cold
(e.g., on ice) in the interim. In the solubilization process,
the liver sample is weighed, placed in a glass homogenizer
(e.g., 15 mL Wheaton); 8 volumes of solubilizing solution*

* The solubilizing solution is composed of 2% NP-40 (Sigma), 9 M urea
(analytical grade, e.g., BDH or Bio-Rad), 0.5% dithiothreitol (DTT;
Sigma) and 2% carrier ampholytes (pH 9-11 LKB; these come as a 20%
stock solution, so 2% final concentration is achieved by making the final
solution 10% 9-11 Ampholine by volume). A large batch of solubilizer
(several hundred mL) is made and stored frozen at -80 °C in aliquots
sufficient to provide enough for one day's estimated sample preparation
requirement. The solution is never allowed to become warmer
than room temperature at any stage during preparation or thawing for
use, since heating of concentrated urea solutions can produce contaminants
that covalently modify proteins, producing artifactual
shifts. Once thawed, any unused solubilizer is discarded.
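The quantities mentioned in the sample preparation text and in this footnote lend themselves to a short worked calculation. The sketch below assumes that "8 volumes" means 8 mL of solubilizing solution per gram of tissue and that "10% by volume" refers to the final batch volume; both readings, and the function names, are assumptions made for illustration.

```python
def solubilizer_volume_ml(tissue_mass_g, volumes=8.0):
    """Volume of solubilizing solution for a weighed liver sample,
    assuming 1 'volume' = 1 mL per gram of tissue."""
    return tissue_mass_g * volumes


def ampholyte_stock_ml(batch_volume_ml, stock_fraction=0.10):
    """Volume of ampholyte stock needed when the stock makes up
    10% of the final batch by volume."""
    return batch_volume_ml * stock_fraction


sample_ml = solubilizer_volume_ml(0.5)   # a 0.5 g liver sample
stock_ml = ampholyte_stock_ml(500.0)     # a hypothetical 500 mL batch
```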
N. Leigh Anderson

An updated two-dimensional gel database of rat liver
proteins useful in gene regulation and drug effect
studies

We have improved upon the reference two-dimensional (2-D) electrophoretic
map of rat liver proteins originally published in 1991 (N. L. Anderson et al.,
Electrophoresis 1991, 12, 907-930). A total of 53 proteins (102 spots) are now
identified, many by microsequencing. In most cases, spots cut from wet, Coomassie
Blue stained 2-D gels were submitted to internal tryptic digestion [2],
and individual peptides, separated by high-performance liquid chromatography
(HPLC), were sequenced using a Perkin-Elmer 477A sequenator. Additional
spots were identified using specific antibodies.



Figure 1 shows the current annotated 2-D map of F344
rat liver, analyzed using the Iso-DALT system (20 × 25
cm gels) and BDH 4-8 carrier ampholytes. Both the
map itself and the master spot number system remain
the same as shown in the original publication. Table 1
lists the important features of each identification shown,
including the gel position, pI and Mr for the most
abundant or most basic form of each protein. Using this
extended base of identified spots, a series of four
improved calibration functions has been derived for the
pI and SDS-Mr axes (the first two of which are shown in
Fig. 2A and B). Both forward and reverse functions are
derived, so that one can compute the physical properties
of a spot with a given gel location, or inversely compute
the gel position expected for a protein having given
physical properties:

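$$Y_{\text{RAT LIVER}} = f_{M_r \rightarrow \text{RAT LIVER } Y}\left(M_{r,\ \text{SEQUENCE-DERIVED}}\right) \tag{1}$$

$$X_{\text{RAT LIVER}} = f_{pI \rightarrow \text{RAT LIVER } X}\left(pI_{\text{SEQUENCE-DERIVED}}\right) \tag{2}$$

$$M_{r,\ \text{GEL-DERIVED}} = f_{\text{RAT LIVER } Y \rightarrow M_r}\left(Y_{\text{RAT LIVER}}\right) \tag{3}$$

$$pI_{\text{GEL-DERIVED}} = f_{\text{RAT LIVER } X \rightarrow pI}\left(X_{\text{RAT LIVER}}\right) \tag{4}$$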
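The forward and reverse calibration functions can be illustrated with a short sketch. It assumes the exponential form y = a + b exp(-x/c), which Table 2 appears to list for gel Y as a function of computed Mr, and it obtains the reverse direction by inverting that form analytically. The paper instead fits the reverse functions separately with their own equations, and the coefficients below are hypothetical placeholders, not the fitted values.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
A, B, C = 200.0, 1500.0, 30000.0


def mr_to_gel_y(mr):
    """Forward direction, in the spirit of Eq. (1):
    gel Y coordinate expected for a protein of mass mr."""
    return A + B * math.exp(-mr / C)


def gel_y_to_mr(y):
    """Reverse direction, in the spirit of Eq. (3):
    mass implied by a gel Y coordinate (analytic inverse)."""
    if not A < y <= A + B:
        raise ValueError("gel Y lies outside the range covered by the curve")
    return -C * math.log((y - A) / B)


y = mr_to_gel_y(66354.0)      # sequence-derived mass -> gel Y
mr_back = gel_y_to_mr(y)      # gel Y -> gel-derived mass
```

With this form, larger masses map to smaller Y values, which matches the direction seen in Table 1, where high-mass proteins have low Y coordinates.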

A spreadsheet program (in Microsoft Excel) was developed
to facilitate flexible computation of pIs from
amino acid sequence data, and the results were entered
into a relational database (Microsoft Access). A table of
spot positions and sequence-derived pIs and Mrs was
fitted with a large series of analytic equations using
Tablecurve (Jandel Scientific), and the four conversion
Eqs. (1)-(4), relating computed pI and gel X coordinate,
or computed molecular weight and gel Y coordinate,
were selected, based on criteria of simplicity, goodness
of fit and favorable asymptotic behavior. Table 2 lists the
equations and coefficients. Application of Eqs. (3) and
(4) to a spot's X and Y coordinates, given in [1], produces
improved Mr estimates, and allows computation of pI
directly in pH units, instead of in terms of positions relative
to creatine phosphokinase (CPK) charge standards.
The inverse Eqs. (1) and (2) were used to compute the
gel positions of a series of pI and Mr tick marks. These
tick marks were plotted with SigmaPlot (Jandel),
together with fiducial marks locating several prominent
spots, and the resulting graphic was aligned over the synthetic
gel image (computed by Kepler from the master
gel pattern) using Freelance (Lotus Development). Maps
were printed as Postscript output from Freelance, either
in black and white (as shown here) or in color, where
label color indicates subcellular location (available from
the first author upon request). We have also used the rat
liver 2-D pattern as presented here to calibrate the patterns
of other samples. Using mixtures of rat liver and
mouse liver samples, for example, we made composite
2-D patterns that allow use of the rat pattern to standardize
both axes of the mouse pattern. This was accomplished
by deriving transformations relating the rat and
mouse X, and separately the rat and mouse Y, axes
(Table 2, lower half; Fig. 2C and D) based on a series of
spots that coelectrophorese in these closely related species.
These functions were then applied to derive equations
relating the mouse liver X and Y to pI and SDS-Mr
(Eqs. 5 and 6 below). The resulting standardized 2-D pattern
for B6C3F1 mouse liver is shown in Fig. 3.
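Deriving an axis transformation from coelectrophoresing spots, and judging it by goodness of fit, can be sketched as below. For brevity the sketch fits a straight line by ordinary least squares, whereas the paper selects more elaborate analytic forms with Tablecurve; the spot coordinates are invented for illustration and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, predict):
    """Coefficient of determination of predict() on the data."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Invented X coordinates of spots that coelectrophorese in both species.
mouse_x = [300.0, 700.0, 1100.0, 1500.0, 1900.0]
rat_x = [310.0, 720.0, 1125.0, 1540.0, 1950.0]

slope, intercept = fit_line(mouse_x, rat_x)


def mouse_x_to_rat_x(x):
    """Fitted transformation from the mouse X axis to the rat X axis."""
    return slope * x + intercept


r2 = r_squared(mouse_x, rat_x, mouse_x_to_rat_x)
```

The same two helpers can be reused for the Y axes, and comparing r2 across candidate forms is one simple way to apply the goodness-of-fit criterion mentioned in the text.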

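$$M_{r,\ \text{MOUSE LIVER}} = f_{\text{RAT LIVER } Y \rightarrow M_r}\left(f_{\text{MOUSE LIVER } Y \rightarrow \text{RAT LIVER } Y}\left(Y_{\text{MOUSE LIVER}}\right)\right) \tag{5}$$

$$pI_{\text{MOUSE LIVER}} = f_{\text{RAT LIVER } X \rightarrow pI}\left(f_{\text{MOUSE LIVER } X \rightarrow \text{RAT LIVER } X}\left(X_{\text{MOUSE LIVER}}\right)\right) \tag{6}$$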

A slightly more complex approach can be used to stand- 
ardize samples that have few or no spots co-electropho- 
resing with rat liver proteins. In this case, a 2-D gel is 
prepared with a mixture of the two samples, and four 
functions (forward and backward, each for X and Y) are 
derived relating each sample's own master pattern to the 
composite. The required functions are then applied in a 
nested fashion to yield the desired result (using rat 
plasma as an example): 

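$$M_{r,\ \text{RAT PLASMA}} = f_{\text{RAT LIVER } Y \rightarrow M_r}\left(f_{\text{RAT PLASMA+LIVER } Y \rightarrow \text{RAT LIVER } Y}\left(f_{\text{RAT PLASMA } Y \rightarrow \text{RAT PLASMA+LIVER } Y}\left(Y_{\text{RAT PLASMA}}\right)\right)\right) \tag{7}$$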
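The nested application of functions in Eq. (7) is ordinary function composition, which the following sketch makes explicit. The three component functions are hypothetical stand-ins with made-up coefficients; in practice each would be a fitted calibration function of the kind listed in Table 2.

```python
import math
from functools import reduce


def compose(*funcs):
    """compose(f, g, h)(x) is f(g(h(x)))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


# Hypothetical stand-ins for the fitted functions.
def plasma_y_to_composite_y(y):
    """Plasma master pattern Y -> composite (plasma + liver) gel Y."""
    return 1.02 * y + 5.0


def composite_y_to_liver_y(y):
    """Composite gel Y -> rat liver master pattern Y."""
    return 0.98 * y - 3.0


def liver_y_to_mr(y):
    """Rat liver master pattern Y -> Mr."""
    return 120000.0 * math.exp(-y / 400.0)


# Outermost function first, matching the nesting order of Eq. (7).
plasma_y_to_mr = compose(
    liver_y_to_mr,
    composite_y_to_liver_y,
    plasma_y_to_composite_y,
)

mr_estimate = plasma_y_to_mr(500.0)
```

Swapping in the X-axis functions in the same order gives the corresponding pI calculation.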
Figure 1. Master 2-D gel pattern of Fischer 344 rat liver proteins, annotated with 53 protein identifications and computed pI and Mr axes.
Tentative identifications are in italic type.



Table 1. Proteins identified in the 2-D pattern of F344 rat liver

| MSN a) | Protein ID b) | Protein name | Identification comments | Gel X c) | Experimental pI d) | Gel Y c) | Experimental Mr d) |
|---|---|---|---|---|---|---|---|
| 126 | HADO_HUMAN e) | 3-HA-3,4-DO: 3-hydroxyanthranilate-3,4-dioxygenase | Internal sequence | 871.95 | 5.36 | 921.35 | 30 207 |
| 137, 159, 288, 258 | DIDH_RAT | 3HDD: 3-hydroxysteroid dihydrodiol reductase | Ab (T. M. Penning) and pure protein | 1857.52 | 6.51 | 822.52 | 34 406 |
| 173 | MUP_RAT | α2u globulin | Presence in liver microsome lumen, abundance in kidney, pI, Mr | 919.16 | 5.43 | 1313.81 | 19 549 |
| 38 | ACTB_HUMAN | Actin β | Analogy with other mammalian patterns (e.g. human) through coelectrophoresis | 763.40 | 5.19 | 693.64 | 41 586 |
| 68 | ACTG_HUMAN | Actin γ | Analogy with other mammalian patterns (e.g. human) through coelectrophoresis | 779.42 | 5.21 | 692.26 | 41 677 |
| 693 | AFAR_RAT | Aflatoxin B1 aldehyde reductase | Internal sequence | 1993.?2 | 6.72 | 818.60 | 34 593 |
| 28, 21, 33 | ALBU_RAT | Albumin | Coelectrophoresis with principal plasma protein | 1262.81 | 5.86 | 445.64 | 66 354 |
| 43 | DHAM_RAT | Aldehyde dehydrogenase | N-Terminal sequence and AAA | 1317.72 | 5.91 | 589.03 | 49 602 |
| 96 | ARGI_RAT | Arginase | Internal sequence | 1730.72 | 6.34 | 756.02 | 37 819 |
| 117 | SUAR_RAT | Arylsulfotransferase | Internal sequence | 1547.96 | 6.14 | 849.08 | 33 186 |
| 1163, 1161, 116?, 20 | GR78_RAT | BiP (GRP-78) | Ab (F. Witzmann) | 665.33 | 5.01 | 397.?9 | 74 564 |
| 185 | CAH3_RAT | CA-III | Uncertain; by comparison with mouse | 1996.60 | 6.72 | 1017.02 | 26 887 |
| 123 | CALM_HUMAN | Calmodulin | Analogy with human cellular patterns through coelectrophoresis | 23.05 | 4.03 | 1433.23 | 17 419 |
| 3, 201, 48, 39, 22, 24 | CRTC_RAT | Calreticulin | Ab (Lance Pohl) | 310.59 | 4.34 | 433.80 | 68 206 |
Table 1. continued

MSN a) | Protein ID b) | Protein name | Identification comments | Gel X c) | Experimental pI d) | Gel Y c) | Experimental Mr d)


1184, 1)86. 


CPSM.RAT 


Carbarn y) phosphate 


3*D of nun nmitifl' Minfinnfd hv 


1 453 J 6 




111 £4 


160 646 


114, 174. 118 


synthase 


f^-icrauaAi sequence ao« aaa 










5. 167, 157 














54. 61 


CATA.RAT 




loiciQii sequence 


ammm.% i 


<71 


AOO HA 


je Toa 


136 


COX2.RAT 


COX- 1 1 


no \j ■ • i iinnm j t com ii mcc d) 




4.01 


iuu.o/ 


7< <<Vi 
3U* 


















87 


CYB5.RAT 


Cytochrome B5 


2-D or pure protein; Ah; confirmed 


515.68 


A 71 


1170 


IS iOl 








by AAA 










41 


CK-RAT" 


Cytoke ratio 


Location in eytoskeleul fraction 


1165.12 


<7< 


<ao no 




29 


CK-RAT' 


Cyiokeratin 


Location in cytoskeletaJ fractioo 


743.11 






it it? 
*6 187 


5. 11 


ENPL-RAT' 


Endoplasmin 


Ah (7. Witzmann) 


567.73 


A B1 




1W 174 


60 


ENOa.RAT 


Enolase A 


Internal sequence and AAA 


1399.78 


6.00 


62334 


46 674 


27 


ER60.RAT 


ER-60 


A'-TenninaJ sequence (R. M. Van Frank) 


1 184 JO 


5.77 




vO.JQ7 


17 


ATPB.RAT 


Fl ATPise 6 


//-Terminal sequence and AAA 


629.06 


A ©< 

4.v^ 




47 UU 


196 


ATP7.RAT 


Fl ATPase 6 


Internal sequence 


I227J4 


< f 7 


1 154.0 J 


22 310 


79 


P16P.RAT 


Fructose- 1.6-bis-pfaospbause Uncertain; by comparison with ID in 


92434 


< At 


717 T7 

Hi. 4 i 


Jo Oj8 








Garrison and Wager (JBC 257:13135-13143) 










'62,78 


DHE3.RAT 


Gluumate dehydrogenase 


rV-Terminal sequence and internal sequence 1887J9 


635 


566.92 


51 655 


125 


HAST- RAT" 


HAST-1: N-bydroxyaryh 


Internal sequence 


1297.94 


5.89 










amiae sulfoiransf erase 












307 


HOI. RAT 


Heme oxygenase 1 


Uncertain; available data from internal 


1219J9 


5.81 


OH 71 


in i?i 








sequence 










413, 1250. 


HMCS.RAT 


HMG CoA synthase, 


Ah (J. Germenhiuten) 


1033.48 




tit 11 




933 




cytosolie 












133. 144. 235 


HMCS.RAT 


HMG CoA synthase. 


Ah (J. Germcrshausen), ^-terminal 


666.40 


5.02 


1010 1? 


7& XII 
*w Oil 






mitochondrial (frag) 


sequence (Steiner/Lottspeich) 










8. 23. 1307 


HS7C.ILM 


HSC-70 


Positional homology (with human, etc.) 


811.87 


5.27 


425.76 


AO {71 

D7 J^l 








through coelectropboresis 










15, 25. 110 


P60.RAT 


HSP-60 


Ab (F. Witzman); confirmed by AMerminaJ 


845.09 


5.32 


520.03 


56 561 








sequence and AAA 










971 


HS70-RAr» 


HSP-70 


Ab (F. Wimnan) 


976.11 


531 


437.14 


67 674 


1216, 1215. 90 HS^RAT" 


HSP-90 


Ah (F. Wiuman) 


659.86 


5.00 


329 


90 107 


256 


INGI-HUMAN 


IntcrferoD-T induced 


Internal sequence 


993.85 


534 


1006.04 


27 237 






protein 












415, 734 


LAMB-RAT' 


1 Jiniw B 


Positional homology with human through 


737.10 


5.14 


425.19 


69 615 




LAMR-RaT 1 




^electrophoresis, nuclear location 










80 


T-aminin receptor* 


Internal sequence 


534.02 


4.77 


697.62 


41 327 


227 


FABL.RAT 


L-FABP (liver fany »cid 


Ab (N. M. Bass) 


1586.09 


6.18 


1483.43 


16 J622 






binding protein) 










134 


MDHC.MOUS 
E 


MaJate dehydrogenase 


Internal sequence 


1270.85 


5.86 


861.96 


32 620 


18. 35, 226 


GR75-RAT 0 


Mitcon 3: grp75 


Positional homology with human through 


905.67 


5.4] 


413.67 


71 589 








coelectropborcsis 










175. 251 


NCPR.RAT 


NADPH P450 reductase 


2-D of pure protein 


824.69 


5.29 


393 J I 


75 366 


1168. 1170, 


PDI.RAT 


PDI: Protein disulfide 


//•Terminal sequence (R. M. van Frank), Ab 


564J0 


4.83 


528.47 


55 618 


1171 




isomense 












47. 93 


ALBU.RAT 


Pro-Albumin 


Microsomal lumen location, p/. M, relative 


1391.03 


5.99 


446.68 


66 195 








to albumin 










236 


APA1.RAT 


Pro-APO A-l lipoprotein 


Coelectrophoresis with plasma protein 


920.41 


5.43 


1137.51 


23 467 


320 


IPK1.BOVIN 


Proteio kinase C inhibitor 1 


Internal sequence; homology with bovine 


1480.01 


6.08 


1458.81 


17 007 


152 






protein 










PNPH.MOUSE 


Purine nucleoside 


Internal sequence 


1507.19 


6.10 


911.16 


30 599 






phospborylase 












1179, 1180, 


PYVC-RAT 1 


Pyruvate carboxylase 


Tentative; 2-D of pure protein (J. G. 


1485.10 


6.08 


22332 


131 589 


1181, 1182. 






Henslec, JBC 1979); reported in Biochim. 










1 183 






Btophys. Aete 1022, 115-125- 










55. 103 


SM30.RAT 


SMP-30: Scoescence 


Internal sequence 


721.71 


5.11 


830.10 


34 051 


135 




marker protein-30 












SODC.RAT 


Superoxide dismutase 


AAA; corafirrned by internal sequence 


1161.24 


5.74 


1388.68 


18 173 


172 


TPM-RAT" 




(R. M. Van Frank) 










Tm: tropomyosin 


Location in cytoskeleton, 2-D position 


476.24 


4.66 ( 


957.86 


28 865 


277, 56 


TBA1.RAT 




relative to human, Ab 










Tubulin o 


Positional homology with human through 


688.22 


5.06 


537.67 


54 620 


50. 1225 


TBB1JUX 




eoelecuopborests, cytoskeletaJ location 










Tubulin 0 


Positional homology with human through 


621.29 


4.93 


535.48 


54 155 


1224 


VIMEJWT 




coelectrophoresis, cytoskeletaJ location 










Vim en tin 


Posi tonal bosolon with human through 


673.00 


5.03 


53930 


54 426 








coelectrophoresis, eytoskeleul location 











Table 1. continued

MSN a) | Protein ID b) | Protein name | Identification comments | Gel X c) | Experimental pI d) | Gel Y c) | Experimental Mr d)


ID 


Unknown 


?: not in sequence 


IntcmaJ sequence 


1191-28 


5.78 


610.42 


42 469 






databases 












104 


BBPL.RAT 


23 kDi morp bine -b tod) n| 


Intemii sequence 


773J1 


5.20 


1112.41 


22 363 






protein 













a) Master spot number (MSN) from [1]
b) SwissPROT identifier
c) Coordinates of the most basic or most abundant assigned spot on the F344 master gel pattern
d) pI and Mr of the most basic or most abundant assigned spot, derived from the calibration functions included here
e) SwissPROT style proposed identifier
Abbreviations: AAA, amino acid analysis; Ab, antibody



Table 2. Equations and coefficients 



Function 



Equation (0 



r2 



Rat gel Y - ficomputei \f.\ i ■ ■ c - *-xpf-xVc) 0.988181021 

Rat gel X « flcomputei pf) y « o - b x • a/lax - dlx «► e/x" 0.99247216 

Computed M, ■ Hrai gel D v • a «p bxc 0.9960177 

Computed p/ - flrat gelJT) > - • 0 + bx ~ ex 3 * dx 3 Inx * or 3 0.99176499 



Mouse gel Y - ftrat gel D 

Mouse gel X - ftm gel X) 
Rat gel X • H mouse gel X) 
Rat gel X « ftmouse gel X) 



y • * 0 + bx + ex*- 5 -r <*r° J lax * 

ex/Inx 0.99951069 

^lur+cx^ + dx 3 0.99926349 

y-o+bJlnx+a^ + dx* 0.99950032 

v - e - bx * cx 3 lnjr * dx" * ex 3 0.9992 S3 2 



178.74803 
-8685665-5 
-8464.5809 

4.044686 



11861.44 
58.935923 
69.740526 
-198.07189 



1967.7892 
-904497.94 

19095881 
-0.00114238 



678.91666 
0.00091353 
0.00050772 
2.0899063 



32363.958 
3856926.1 
-0.9086255 
0.0000323 



18276844 -27154534 
-0.00000455 0.00000000176 



-0.78964914 
-0.000213688 
-0.000130392 
-0.000671191 



15673639 
0.00000159 
0.00000116 
0.000145189 



-6953.9592 



-0.000000986 



[Figure 2 plots. Curve labels shown on the panels: y = a + bx + cx/ln x + d/x + e/x^(1.5); y = a + b exp(-x/c); y = a + bx + cx^2 ln x + dx^(2.5) + ex^3; y = a + bx^2 ln x + cx^(2.5) + dx^3. Axis labels: computed MW, B6C3F1MST2.X, B6C3F1MST2.Y.]

Figure 2. Plots showing fits of selected equations (continuous curves) to data on identified proteins (square symbols). (A) pI computed from
sequence data versus gel X position for identified spots in F344 rat liver; (B) Mr computed from sequence data versus gel Y position for identified
spots in F344 rat liver; (C) gel X position for spots in B6C3F1 mouse liver versus X position in F344 rat liver, for coelectrophoresing spots; (D)
gel Y position for spots in B6C3F1 mouse liver versus Y position in F344 rat liver, for coelectrophoresing spots. In each case, inverse equations
were also computed (Table 2).
B6C3F1 MOUSE LIVER 2-D PROTEIN PATTERN

Figure 3. Master 2-D gel pattern for B6C3F1 mouse liver, standardized using the F344 rat liver pattern identifications, according to the method
described in the text. Twenty-nine proteins are identified.

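$$pI_{\text{RAT PLASMA}} = f_{\text{RAT LIVER } X \rightarrow pI}\left(f_{\text{RAT PLASMA+LIVER } X \rightarrow \text{RAT LIVER } X}\left(f_{\text{RAT PLASMA } X \rightarrow \text{RAT PLASMA+LIVER } X}\left(X_{\text{RAT PLASMA}}\right)\right)\right) \tag{8}$$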

This unified approach, in which one well-populated 2-D
pattern is used to standardize a family of other patterns,
has the additional advantage that the resulting pI and Mr
scales are directly compatible. Hence one can compare
the relative pIs of mouse and rat versions of a sequenced
protein in a consistent pI measurement system,
and select likely inter-species analogs based on positional
relationships on common scales. Adoption of
immobilized pH gradient (IPG) technology [4-7] will
result in substantial improvements in pI positional
reproducibility for standard 2-D maps such as those presented
here; however, we believe that our approach will
continue to be useful in establishing the empirical pH
gradient actually achieved by such gels under given
experimental conditions (temperature, urea concentration,
etc.), and in relating patterns run on different IPG
ranges and using different lots of IPG gels (between
which some variation will persist). Development of
rodent organ maps is a continuing effort in our laboratories
[8-10], and results in regular additions of identified
proteins. Those who wish to receive current rodent liver
maps with color annotations should send a stamped
self-addressed envelope to the first author.



We would like to thank the individuals who provided antibodies
mentioned in Table 1, and R. M. van Frank for unpublished
sequence data.
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Introduction 

The advent of large genome sequencing projects has chanced the scale of w . 
Over a relatively short period of time, we have witnessed ,h, ti J 0, ° g - V - 
""^""cleodde^^ 

sequence of an eukarvo.ic chromosome (Oliver,/ at 1 9> ) and L t ^ ^ 
see the definition of all open reading frames ^n'*^ ? " ™ (mUTewUl 
M 77 , a s, r Escher ^ ~ .nc luding 

>><>'><t>'>selexansandArahiJo P sistl,al^ oennlcT """ 

are not an end in themsleves. In fact, the v onlv re™ n a S ^""^ 

There are two approaches that can be used to examine een P P,„ r „ • 
scale. One uses nucleic acid-based technolo*v ih, «h «P™Mon on a large 

The most promismg nucleic-acid ^S 0 £ ^rZZTl 
'Liang and Pardee. 1 992: Bauer et al J 99^5, .. d,s P |a y of mRNA 

* Corresponding Author 

- S.0.00 SO .00 O .n,crce P , Ud. P.O. Bo, 7,6. Awtew . Hlmpshire SP(0 , yc ^ 
HEPG2 2D-PAGE MAP

[Map image. Legible spot labels include calreticulin, albumin, tubulin, Apo AI, TCTP, ATP synthase D, PGDH, glutathione S-transferase, MER5, cytidylate kinase, transthyretin, FABP, triosephosphate isomerase, PBP, thioredoxin, HBB, ATP synthase CF6, ubiquitin, and cytochrome C oxidase VA and VIA.]

Figure 1. Two-dimensional gel electrophoresis map of a human hepatoblastoma-derived cell line
illustrating the very high resolution of the technique. The first dimensional separation (right to left of figure)
was achieved using immobilised pH gradient electrophoresis of 4.0 to 10.0 units. The second dimension
(top to bottom of figure) was SDS-PAGE using an acrylamide gradient, allowing separation in
the molecular weight range 10-250 kDa. Proteins were visualised by silver staining. Arrows show proteins
of known identity.



1992; Celis et al., 1993; Garrels and Franza, 1989; VanBogelen et al., 1992). Current
protocols can resolve two to three thousand proteins from a complex sample on a
single gel (Figure 1).



2-D GEL RESOLUTION AND REPRODUCIBILITY 

A primary challenge of separating complex mixtures of proteins by 2-D gel electro- 
phoresis has been to achieve high resolution and reproducibility. High resolution 
ensures that a maximum of protein species are separated, and high reproducibility is 



Progress with proteome projects




Figure 2. Two-dimensional gel electrophoresis allows 'zooming in' on areas of interest. Rings highlight
2 proteins common to each gel. (A) Wide pI range two-dimensional electrophoresis map of human plasma
proteins. First dimension separation was achieved using an immobilised pH gradient of 3.5 to 10.0 units.
The second dimension was SDS-PAGE. Actual gel size was 16 cm x 20 cm, and proteins were visualised
with silver staining. (B) Narrow pI range electrophoresis was used to 'zoom in' on a small region of the
plasma map. The first dimension used a narrow range immobilised pH gradient of 4.2 to 5.2 units, and
second dimension was SDS-PAGE. Micropreparative loading was used, and the gel blotted to PVDF.
Proteins were visualised with amido black. Actual blot size was 16 cm x 20 cm.

the use of piperazine diacrylyl as a gel crosslinker and the addition of thiosulfate in the
catalyst system has been shown to give better resolution and higher sensitivity
detection (Hochstrasser and Merril, 1988; Hochstrasser, Patchornik and Merril,
1988).
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Table 1: O~*:^on siai.n for 2-D eels or hints and their application*. 



Detection Main Unsuitable Srn«iti\n\ Rcirrcncc^ 

Method application* application* 



i :, S| Met or "C 
radiolabcllmg and 
fiuorocrnph\ or 
phosphnnmuCing 

[ **S)thiourea silver 



SiKcr 



Conma.«iie blue 
R-250 



Colloidal cold 



Zinc imidazole 



Ponceau S and 
ami do hl;iwk 



India ink 



.Siamv.jll 



Cell lines, 
.uliured nrcanism* 



Extremely Inch 
*ensm\n\ eel 
MjinrnL' 

Vcr> Inch scnsi- 
i.vn\ eel Mjininc. 
can be mono or 
pol\ chromatic 

Staining of gels: 
staining of PVDF 
membranes he I ore 
protein sequencing 



Staining NC 
. membranes, 
staining PVDF 
before direct 
MALDJ-TOF 

Reverse staining 
of gels or mem- 
branes: may he 
beneficial in 
MALD1-T0F 
of peptides 

Staining higher 
protein load** on 
PVDF. for protein 
sequencing or amino 
acid anal) sis 

Staining of 
membrane-hound 
proictns: staining 
PVDF before direct 
MALDi-TOF 

Staining 10 detect 
glycoprotein* or 
Ca : * binding 
proteins 



Samples ihat 
can n 01 he labelled 



Preparative 2-D. 
PVDF or NC 
membranes 

Preparame 2-D. 
PVDF or NC 
membranes 

Staining prior to 
direct masc deter- 
mination from 
PVDF: amino acid 
analysis on PVDF; 
detection of some 
glycoproteins 

Gels 



Where positive 
image is required 



Staining prior to 
direct mass 
determination trom 
PVDF 

Gel staining, not 
quantitatne from 
protein to protein 



General eel siainini: 



20 ppm of 
radiolanei in 
a spot 

(I J ng protein 
on spot or band 
of gel 

J ng protein 
on spoi or 
hand of gel 

4(1 ng protein 
on band or 
spot of gc! 



GaneK an J Frjn/a. 
iviyy 

Latlum. Carre'* and 
Soiitr. \wy 

Wallace and Saluz. 
ivvij.b 

Rahilloud. I uu 2. 
Hochst raster ;nd 
Mcml. 

Strupat rtal.. IW4; 
Gharahdachi n aL. 

GolJhcrg rt at.. |s*XK; 
Sanchez rt a I.. IV92 



6(1 higher 
than 

coomassic 



Higher than 
C4»omassie 



MKJnc 
protein on 
hand or *poi 
of gel 

l-IOnc 



Yamacuchi and 
Asakawa. 1988: 
Eckcrskorn rt at.. 
IV92: 

Strupat rt at.. IW4 

Ortiz rt at.. I««2: 
James rt at.. 1993 



Sanchez rt a!.. 
Sirtipai r; ///.. |»iuj4; 
Wilkins rt at.. |y*>5. 



Li rial.. I4S9; 
Hugho. Mack and 
Hamparian. I^NK; 
Strupat rt at.. IVWJ 



UK) ng protein Cainphell. 

on band or MacLcnnan and 

spot ol gel Jorgcnscn. IVKV 

Goldberg rt at., ivxs 



P\ DF s pul> -\1n\l1Jrne dilltmndc. NC = niiroccllulmc. MALDI-TOF = matrix jxmmcJ u*cr tJcsnrpuoii loni^iimn nine 
<«: litem ma« *peciromeir\ . 

example, some glycoproteins are not stained by coomassie blue (Goldberg et al.,
1988), and many organic dyes are unsuitable for protein detection on PVDF if samples
are to be used for direct matrix-assisted laser desorption ionisation mass spectrometry
(Strupat et al., 1994).

Although most means of protein detection give some indication of the quantities of
protein present, in general they cannot be used for global quantitation. This is because

details of their post-translational modifications. 2-D gel databases are beginning to be
linked to or integrated with comprehensive protein and nucleic acid [illegible]
(Neidhardt et al., 1989; Simpson et al., 1992; Appel [illegible]) and [illegible]
databases, containing DNA sequence data, chromosomal map locations, reference
2-D gels and protein functional information for an organism, are becoming established
as genome and proteome projects progress (VanBogelen et al., [year illegible]; Yeast Protein
Database cited in Garrels et al., 1994).



GEL IMAGE ANALYSIS AND REFERENCE GELS 

After 2-D electrophoresis and protein visualisation by staining, fluorography or
phosphorimaging, images of gels are digitised for computer analysis by an image
scanner, laser densitometer, or charge-coupled device (CCD) camera (Garrels, 1989;
Celis et al., 1990a; Urwin and Jackson, 1993). All systems digitise gels with a
resolution of 100 - 200 µm, and can detect a wide range of densities or shading (256
or more 'grey scales'). Following this, gel images are subjected to a series of
manipulations to remove vertical and horizontal streaking and background haze, to detect
spot positions and boundaries, and to calculate spot intensity (Figure 3). A standard
spot (SSP) number, containing vertical and horizontal positional information, is
assigned to each detected spot and becomes the protein's reference number. Table 2
lists some notable software packages which process 2-D gel images.



Table 2: Some Software Packages for the Analysis of Gel Images.

| Gel Image Analysis System | References* |
|---|---|
| GELLAB | [illegible], 1988; [illegible]; Lemkin and Lipkin, [year illegible]; Lemkin, Wu and Upton, [year illegible]; Myrick et al., [year illegible] |
| MELANIE | Appel et al., 1991; Hochstrasser et al., 1991b |
| QUEST and PDQUEST | Garrels, 1989; Monardo et al., [year illegible]; [illegible]; Celis et al., 1990a,b |
| TYCHO and KEPLER | Anderson et al., [year illegible]; Richardson, Horn and Anderson, [year illegible] |

* These references are not exhaustive; they include some references of use as well as authors of
the [illegible].



As there are difficulties in the electrophoresis of samples with 100% reproducibility,
reference gel images are often constructed from many gels of the same sample

[text illegible]000 to 4000 proteins from one gel to another, it presents a considerable challenge to
image analysis systems. Matching of gels is usually initiated by an operator who
manually designates approximately 50 or so prominent spots as "landmarks" on gels
to be cross-matched. Proteins which match are then established around landmark
[illegible] algorithms to extend the matching to the entire [illegible],
although different degrees of operator intervention may be required (Olsen and Miller,
1988; Lemkin and Lester, 1989; Garrels, 1989; Myrick et al., 1993).

CALCULATION OF PROTEIN ISOELECTRIC POINT AND MOLECULAR WEIGHT

Estimation of the isoelectric point (pI) and molecular weight (MW) of proteins from
2-D gels provides fundamental parameters for each protein, which are also of use
during identification procedures (see following section). The pI and MW of proteins
are recorded in 2-D gel databases. Accurate estimations of protein pI and MW can be
obtained by using 20 or more known proteins on a reference map to construct standard
curves of pI and molecular weight, which are then used to calculate estimated pI and
MW of unknown proteins (Neidhardt et al., 1989; Garrels and Franza, 1989;
VanBogelen, Hutton and Neidhardt, 1990; Anderson and Anderson, 1991; Anderson et
al., 1991; Latham et al., 1992). Alternatively, the MW of individual proteins blotted
to PVDF can be determined very accurately by direct mass spectrometry (Eckerskorn
et al., 1992). Where immobilised pH gradients are used, the focusing position of
proteins allows their pI to be measured within 0.15 units of that calculated from the
amino acid sequence (Bjellqvist et al., 1993c). It must be noted, however, that proteins
carrying post-translational modifications may migrate to unexpected pI or MW
positions during electrophoresis (Packer et al., 1995).
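
The standard-curve approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any cited package: pI is interpolated piecewise-linearly between marker proteins along the first dimension, and log10(MW) is fitted as a straight line against migration distance in the second dimension. The function names and the coordinate convention are hypothetical, and a real reference map would use 20 or more markers rather than the handful a caller might pass here.

```python
import math


def interpolate_pi(x, standards):
    """Estimate pI at gel position x from (x_position, pI) marker proteins.

    Assumes piecewise-linear behaviour between neighbouring markers and
    distinct marker positions; refuses to extrapolate outside the markers.
    """
    pts = sorted(standards)
    if not pts[0][0] <= x <= pts[-1][0]:
        raise ValueError("spot lies outside the calibrated pI range")
    for (x0, p0), (x1, p1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return p0 + (p1 - p0) * (x - x0) / (x1 - x0)


def fit_log_mw(standards):
    """Least-squares fit of log10(MW) = a + b * y from (y_position, MW) markers."""
    ys = [y for y, _ in standards]
    logs = [math.log10(mw) for _, mw in standards]
    n = len(standards)
    mean_y = sum(ys) / n
    mean_l = sum(logs) / n
    sxx = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((y - mean_y) * (l - mean_l) for y, l in zip(ys, logs))
    b = sxy / sxx
    a = mean_l - b * mean_y
    return a, b


def estimate_mw(y, a, b):
    """Convert a migration distance back to an estimated MW using the fitted line."""
    return 10 ** (a + b * y)
```

A spot's estimated pI and MW obtained this way are apparent values only; as noted above, post-translational modifications can shift a protein away from the position its sequence would predict.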

SPOT QUANTITATION AND EXPRESSION ANALYSIS 

A major challenge faced in proteome projects is the quantitative analysis of proteins
separated by 2-D electrophoresis. The most accurate means of protein quantitation is
to determine chemically the amount of each protein present by amino acid
compositional analysis. However, the current method of choice for quantitative analysis
of many proteins is to radiolabel samples with [35S]methionine or 14C amino acids,
perform the 2-D electrophoresis, and measure protein levels in disintegrations per
minute (dpm) or units of optical density. Quantitation is achieved either by liquid
scintillation counting, or by gel image analysis where spot densities are quantified
by reference to gel calibration strips containing known amounts of radiolabelled
protein or against the integrated optical density of all spots visualised (Vandekerkhove
et al., 1990; Celis et al., 1990b; Celis and Olsen, 1994; Garrels, 1989; Latham,
Garrels and Solter, 1993; Fey et al., 1994). All approaches effectively allow spots to
be normalised against the total disintegrations per minute loaded onto the gel.
Limitations that remain with radiolabelling methods are that absolute quantitation is
not achieved because all proteins have varying amounts of any amino acid, and that
only easily labelled samples can be investigated. Quantitative silver staining presents
an alternative (Giometti et al., 1991; Harrington et al., 1992; Rodriguez et al., 1993;
Myrick et al., 1993), which when undertaken with [35S]thiourea (Wallace and Saluz,
1992a,b) is of extremely high sensitivity.
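
As a rough illustration of the normalisation step mentioned above, the following Python sketch expresses each spot as a fraction of the summed signal of all spots on the gel. The dictionary layout (SSP number mapped to dpm or integrated optical density) and the function name are assumptions made for the example; they are not taken from any of the cited image analysis systems.

```python
def normalise_spots(spot_values):
    """Express each spot as a fraction of the total signal on the gel.

    spot_values maps a spot identifier (for example an SSP number) to its
    measured signal, in dpm or integrated optical density. The result is a
    relative abundance, not an absolute amount of protein.
    """
    total = sum(spot_values.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {ssp: value / total for ssp, value in spot_values.items()}


def parts_per_million(fractions):
    """Rescale fractional abundances to parts per million of total signal."""
    return {ssp: 1e6 * frac for ssp, frac in fractions.items()}
```

Normalised values of this kind can be compared between gels loaded with different amounts of label, which is what makes gel-to-gel expression comparisons possible.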

When protein spots from samples prepared under different conditions are quantified
and matched from gel to gel, it becomes possible to examine changes and patterns in
protein expression. Large scale investigation of up- and down-regulation of proteins,
their appearance and disappearance, can be undertaken. For example, simian virus 40
transformed human keratinocytes were shown to have 177 up-regulated and 58
down-regulated proteins compared to normal keratinocytes (Celis and Olsen, 1994); detailed
synthesis profiles of 1200 proteins have been established in 1 to 4 cell mouse embryos
(Latham et al., 1991, 1992); and 4 proteins out of 1971 were found to be markers for

FEATURES OF PROTEOME DATABASES

Proteome projects rely heavily on computer databases to store information about all
proteins expressed by an organism. 'Proteome databases' should contain detailed
information of proteins already characterised elsewhere, as well as protein data from
2-D gels such as apparent pI and MW, expression level under different conditions,
subcellular localisation, and information on post-translational modifications. Images
of reference 2-D gels, showing protein SSP numbers and protein identifications,
should also be included. Ideally, proteome databases should be accessible with
Macintosh or IBM personal computers and easy to use. Some proteome databases and
the areas they cover are listed in Table 3. Databases range from collections of
annotated gels to large databases of images integrated with protein and nucleic acid
sequence banks.

One example of an integrated proteome database is the suite of SWISS-PROT,
SWISS-2DPAGE and SWISS-3DIMAGE databases (Appel et al., 1993; Appel et al.,
1994; Appel, Bairoch and Hochstrasser, 1994; Bairoch and Boeckmann, 1994). The
features of these three databases are listed in Table 4. SWISS-PROT,
SWISS-2DPAGE and SWISS-3DIMAGE are accessible through the World Wide Web

Table 4: The SWISS-PROT, SWISS-2DPAGE and SWISS-3DIMAGE suite of crosslinked databases.
All three databases are accessible through the World Wide Web, at URL address:
http://expasy.hcuge.ch/

alternative to traditional approaches (Table 5; Wasinger et al., 1995). This involves the
use of rapid and cheap identification tools such as amino acid analysis and peptide
mass fingerprinting as first steps in protein identification, followed by the use of
slower, more expensive and time consuming identification procedures if necessary. In
the construction of this hierarchy the analysis time, cost per sample and the complexity
of the data created has been considered, as whilst some techniques require little
machine time per sample, the analysis of data can be quite involved and time
consuming. Amino acid analysis and peptide mass-fingerprinting based identification
techniques in the hierarchy are discussed in detail below. For review of other protein
identification techniques in Table 5, see Patterson (1994) and Mann (1995).

PROTEIN IDENTIFICATION BY AMINO ACID COMPOSITION

There has been a revival of interest in the use of amino acid composition for
identification of proteins from 2-D gels after early work by Eckerskorn et al. (1988).
This technique uses a protein's idiosyncratic amino acid composition profile in order
to identify it by comparison with theoretical compositions of proteins in databases.
The amino acid composition of proteins can be determined by differential metabolic
radiolabelling and quantitative autoradiography after 2-D electrophoresis (Garrels et
al., 1994; Frey et al., 1994), or by acid hydrolysis of membrane-blotted proteins and
chromatographic analysis of the resulting amino acid mixture (Eckerskorn et al.,
1988; Tous et al., 1989; Gharahdaghi et al., 1992; Jungblut et al., 1992; Wilkins et al.,
1995). As differential metabolic labelling experiments require X-ray film or
phosphorimaging plate exposures of up to [illegible]40 days, and can only be undertaken with easily
radiolabelled samples, the technique is not as rapid or widely applicable as chromato-

Figure 5. A PVDF protein spot from an E. coli 2-D reference map was sequenced for 4 cycles and the
same sample then subjected to amino acid analysis. The N-terminal sequence was M L K R. When the amino
acid composition of the spot, as well as estimated pI and MW, were matched against all entries in
SWISS-PROT for E. coli, the above list of best matches was produced. N-terminal sequences are from SWISS-PROT
for those entries. The top ranking identification of serine hydroxymethyltransferase (bold) did not show a
large score difference between the first and second ranking proteins, giving little confidence in this being
the correct protein identification. However, the sequence tag (M L K R) confirmed the identity of the
protein as serine hydroxymethyltransferase.



tryptophan are destroyed during hydrolysis, asparagine and glutamine are deamidated
to their corresponding acids, and proline is not quantitated in some analysis systems.
The computer programs produce a list of best matching proteins, which are ranked by
a score that indicates the match quality. Some programs allow matching to be
restricted to specific 'windows' of MW and pI (Hobohm, Houthaeve and Sander,
1994; Wilkins et al., 1995), and to protein database entries for one species (Jungblut
et al., 1992; Wilkins et al., 1995). The use of such restrictions increases the power of
matching. An example of protein identification by amino acid composition is shown
in Figure 4. To date, amino acid composition has been used to identify proteins from
reference maps of Spiroplasma melliferum, Mycoplasma genitalium, E. coli,
Saccharomyces cerevisiae, Dictyostelium discoideum, human sera, human heart, human
lymphocyte, and mouse brain (Cordwell et al., 1995; Wasinger et al., 1995; Wilkins
et al., 1995; Jungblut et al., 1992, 1994; Garrels et al., 1994; Frey et al., 1994).
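
The ranking step described above can be illustrated with a short Python sketch. The text does not specify the score the cited programs use, so the sum of squared differences in molar percent used here is an assumption chosen for simplicity. Following the hydrolysis effects noted above, Asn/Asp and Gln/Glu are pooled (as Asx and Glx) and Cys and Trp are ignored. The database layout, the window tolerances and all function names are hypothetical.

```python
from collections import Counter

# Acid hydrolysis deamidates Asn and Gln, so they are pooled with Asp and Glu.
POOLED = {"N": "B", "D": "B", "Q": "Z", "E": "Z"}  # B = Asx, Z = Glx
# Residues assumed to be destroyed during hydrolysis and left out of the profile.
EXCLUDED = {"C", "W"}


def composition(sequence):
    """Theoretical composition of a sequence, in molar percent per residue class."""
    counts = Counter(POOLED.get(aa, aa) for aa in sequence if aa not in EXCLUDED)
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def score(observed, theoretical):
    """Sum of squared differences in molar percent; lower means a closer match."""
    keys = set(observed) | set(theoretical)
    return sum((observed.get(k, 0.0) - theoretical.get(k, 0.0)) ** 2 for k in keys)


def rank(observed, database, mw=None, mw_tol=0.2, pi=None, pi_tol=0.5):
    """Rank database entries against an observed composition.

    database maps a name to {"seq": str, "mw": float, "pi": float}.
    If mw is given, entries outside mw +/- mw_tol * mw are skipped; if pi is
    given, entries outside pi +/- pi_tol are skipped (the 'windows').
    Returns a list of (score, name) tuples, best match first.
    """
    hits = []
    for name, entry in database.items():
        if mw is not None and abs(entry["mw"] - mw) > mw_tol * mw:
            continue
        if pi is not None and abs(entry["pi"] - pi) > pi_tol:
            continue
        hits.append((score(observed, composition(entry["seq"])), name))
    return sorted(hits)
```

Restricting the search with MW and pI windows shrinks the candidate list before scoring, which is one way to see why such restrictions increase the power of matching.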



PROTEIN IDENTIFICATION BY AMINO ACID COMPOSITION AND N-TERMINAL
SEQUENCE TAG

When samples from 2-D gels are not unambiguously identified by amino acid

(Nikodem and Fresco, 1979; Crimmins et al., 1990; Vanfleteren et al., [year illegible]).

After proteins are digested, peptide masses are determined by mass spectrometry.
Direct analysis of peptide mixtures can be achieved by electrospray ionisation mass
spectrometry, plasma desorption mass spectrometry or matrix assisted laser desorption
ionization (MALDI) mass spectrometry techniques. MALDI is preferable because of
its higher sensitivity and greater tolerance to contaminating substances from gels
(James et al., 1993; Mortz et al., 1994; Pappin, Hojrup and Bleasby, 1993).
Furthermore, recent modifications to sample preparation methods have largely solved early
difficulties experienced with the calibration of MALDI spectra (Mortz et al., 1994;
Vorm and Mann, 1994; Vorm, Roepstorff and Mann, 1994). The high sensitivity of
mass spectrometry allows a small fraction of a digest of a 1 µg protein spot to be used
for analysis, and analysis itself is complete in a few minutes.

A major challenge associated with peptide mass fingerprinting is data interpretation
prior to computer matching against libraries of theoretical peptide digests. Spectra
must be examined carefully to determine which peaks represent peptide masses of
interest, as there are often enzyme autodigestion products and contaminating
substances present (Henzel et al., 1993; Mortz et al., 1994; Rasmussen et al., 1994).
Furthermore, if protein alkylation and reduction has not been undertaken prior to
protein digestion, peptide sequence coverage may be poor (40% to 70%) with some
masses present representing disulfide bonded peptides originally present in the protein
(Mortz et al., 1994). For eukaryotes, a serious issue is the alteration of peptide masses
by the presence of post-translational modifications (Table 6). The mass of the
unmodified peptide alone can be very difficult to determine. Two artifactual
modifications introduced by electrophoresis, an acrylamide adduct to cysteine and the
oxidation of methionine, are also known to alter peptide masses (le Maire et al., [year illegible];
Hess et al., 1993).



Table 6: Masses of some common post-translational modifications. Peptides carrying
post-translational modifications complicate data analysis for peptide mass fingerprinting protein
identification. This is especially so for protein glycosylation, which involves many different
combinations of the hexosamines, hexoses, deoxyhexoses, and sialic acid.

hypothetical protein 1 - Azotobacter vinelandii
CHLOROPLAST HEAT SHOCK PROTEIN PRECURSOR. - PISUM SATIVU
Tropomyosin - African clawed frog

HIWI354 premature term, at 793 - Human immunodeficiency
TRAJ PROTEIN. - ESCHERICHIA COLI.

Figure 6. Theoretical cross-species matching of human apolipoprotein A-I by amino acid composition
and tryptic peptides. When an unknown protein is analysed, best ranking proteins from both techniques can be
compared. If the same protein type is observed in both lists, there is high confidence in this being the
identity of the unknown molecule (Cordwell et al., 1995). (A) Output of ExPASy server (Appel, Bairoch
and Hochstrasser, 1994) where the amino acid composition of apolipoprotein A-I was matched against
all entries in the SWISS-PROT database, without pI or MW windows. Seven of the top 10 matching
proteins were apolipoprotein A-I of different species. (B) Output of MOWSE peptide mass fingerprinting
program (Pappin

The status of proteome projects

Many technical aspects of proteome research have already been discussed in this
review, but an overview of the status of proteome projects has not yet been presented.
Advances in proteome projects will initially rely on progress in genome sequencing
initiatives, to enable an identity, amino acid sequence, or function to be assigned to
each protein spot. Table 7 shows genome size, proteome size, and the number of
proteins already defined for a number of model organisms. This indicates that whilst
genome sequencing programs for E. coli and S. cerevisiae are advanced, the massive
size of some other genomes (and especially the human genome) means that their
complete nucleotide sequences are unlikely to be available for many years. Because of
this, 2-D reference maps and proteome projects of single cell organisms like
Mycoplasma sp., E. coli and S. cerevisiae will be the most detailed (Cordwell et al., 1995;
Wasinger et al., 1995; VanBogelen et al., 1992; Garrels et al., 1994), and complete
maps of other organisms will take longer to construct. However, the use of
cross-species protein identification techniques will allow proteomes of many prokaryotes
and simple eukaryotes to be partially defined in reference to E. coli and S. cerevisiae.

Table 7: Estimated genome size, estimated proteome size, number of protein sequences in
SWISS-PROT Release 31 (March, 1995), and approximate number of proteins of known identity on 2-D
reference maps for some model organisms. Genome size data from Smith ([year illegible]), and total protein data
from Bird (1995). Genome sequencing projects of E. coli and S. cerevisiae will probably be complete in
1996.

| Species name | Haploid genome size (million bp) | Estimated proteome size (total proteins) | Protein entries in SWISS-PROT | Proteins annotated on 2-D maps |
|---|---|---|---|---|

[Table entries illegible in source.]

The study of vertebrate proteomes and vertebrate development is a phenomenal
undertaking in comparison to the investigation of single cell organisms. This is
because vast numbers of proteins are developmentally expressed, each body tissue has
hundreds of unique proteins, and there are numerous tissue types. However, it is
estimated that at least 35% of proteins in vertebrate cells will be conserved from tissue
to tissue, constituting the 'housekeeping' proteins (Bird, 1995), with the remainder of
proteins constituting a set that are specific to a cell type. Providing that standardised
electrophoretic conditions are used, reference maps from many tissues of one
organism can be superimposed in gel databases (e.g. Hochstrasser et al., 1992). This
accelerates the definition of the 'housekeeping' proteins, as well as sets of proteins that
are unique to different tissue types. Such studies may, however, be complicated by
post-translational modifications, which can differ on the same gene product in
different tissues. Proteins that remain unknown after identification procedures will be
useful in providing focus for nucleic acid sequencing initiatives.

This review has described recent advances in the area of proteome research. It has
illustrated how new development of older techniques (2-D electrophoresis, and amino
acid analysis) as well as the applications of new technology (mass spectrometry) have
greatly widened the choice of tools the biologist and protein chemist has for the
separation, identification and analysis of complex mixtures of proteins. This has made
possible the establishment of detailed reference maps for organisms, which are
becoming the method of choice for the definition of tissues or whole cells, and the
investigation of gene expression therein.

Proteome projects are already impacting on the dogma of molecular biology that
DNA sequence constitutes the definition of an organism. For example, the proteomes
of different tissues of a single organism are often significantly different. Similarly,
cross-species identification of proteins (for example the identification of proteins
from Candida albicans by comparison with S. cerevisiae) can open up studies on
organisms that are poorly molecularly defined. As cross-species identification can
proceed at a pace orders of magnitude faster than a genome project in terms of
defining the gene and protein complement of organisms, the need for the DNA
sequencing of genomes will be avoided, and emphasis placed on those found to be
novel.

Just as genome sequencing is not an end in itself, neither is an annotated 2-D protein
reference map of an organism, nor indeed the identification of proteins in a proteome.
So whilst an immediate aim of proteome projects is to screen proteins in reference
maps, this will lead to expression studies and characterisation of post-translational
modifications. The challenge that then needs to be addressed is the investigation of
structure and function of proteins in a proteome. The magnitude of this is illustrated by
the fact that over half the open reading frames identified in S. cerevisiae chromosome
III were initially of no known function (Oliver et al., 1992). Structural and functional
studies will be an undertaking just as formidable as genome studies are now and
proteome projects are becoming, but will lead to an unimaginably detailed
understanding of how living organisms are constructed and how they operate.

Rapid cross-species identification of proteins from 2-D reference maps can be
undertaken with amino acid composition or peptide mass fingerprinting methods
(Figure 6), but these techniques alone may not identify proteins unambiguously when
phylogenetic cross-species distances are great or analysis data is of poor quality (Yates
et al., 1993; Shaw, 1993; Cordwell et al., 1995). However, very high confidence in
protein identities can be achieved when lists of best-matching proteins generated by
both techniques are compared (Cordwell et al., 1995; Wasinger et al., 1995). The
correct identification is found when the same protein is ranked highly in lists of best
matches generated by both techniques. This method has allowed approximately 120
proteins from the reference map of the mollicute Spiroplasma melliferum,
representing approximately one quarter of the proteome, to be confidently identified by
reference to protein information from other species (S. Cordwell, Personal
Communication). When cross-species protein identification is to be undertaken, it should be
noted that the molecular weight of a protein type across species is usually highly
conserved, but that protein pI can vary by more than 2 units (Cordwell et al., 1995).
Accurate molecular weight determination by direct mass spectrometry of proteins
blotted to PVDF (Eckerskorn et al., 1992) should therefore be a useful additional
parameter for cross-species protein identification.
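
The comparison of the two best-match lists described above can be sketched as follows. This is a minimal illustration, assuming each technique returns protein types (species-independent names) in rank order; the cut-off of ten entries and the summed-rank ordering are choices made for the example rather than anything specified in the text.

```python
def consensus(aa_ranked, pmf_ranked, top_n=10):
    """Protein types found in the top_n of both ranked lists.

    aa_ranked and pmf_ranked are lists of protein type names, best match
    first, from amino acid composition and peptide mass fingerprinting
    respectively. Results are ordered by the sum of the two rank positions,
    with ties broken alphabetically.
    """
    def first_positions(ranked):
        positions = {}
        for index, name in enumerate(ranked[:top_n]):
            positions.setdefault(name, index)  # keep the best rank per type
        return positions

    aa_pos = first_positions(aa_ranked)
    pmf_pos = first_positions(pmf_ranked)
    shared = set(aa_pos) & set(pmf_pos)
    return sorted(shared, key=lambda name: (aa_pos[name] + pmf_pos[name], name))
```

An empty result simply means the two techniques disagree within the chosen cut-off, in which case additional parameters such as an accurately measured MW would be needed.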

CHARACTERISATION OF POST-TRANSLATIONAL MODIFICATIONS 

Many proteins are modified after translation. Such post-translational modifications,
including glycosylation, phosphorylation, and sulfation (see Table 6), are usually
necessary for protein function or stability. Some abnormal modifications are
associated with disease (Duthel and Revol, 1993; Ghosh et al., 1993; Yamashita et al.,
1993). In proteome studies, post-translational modifications can be examined on all
proteins present, or on individual spots. Studies on all proteins provide an indication
of which proteins may carry a certain type of modification. For example, 2-D gel
analysis of cell cultures grown in the presence of [3H]mannose or [32P]phosphate
gives an indication of which proteins carry glycans containing mannose, and which
proteins are phosphorylated (Garrels and Franza, 1989). Lectin binding studies of 2-D
gels blotted to PVDF or nitrocellulose provide information on the saccharides, if any,
that are carried by proteins present (Gravel et al., 1994).

When individual proteins of interest carrying post-translational modifications have
been found, micropreparative 2-D electrophoresis can be used to purify them in
microgram quantities (Hanash et al., 1991; Bjellqvist et al., 1993b). If protein
isoforms of similar MW and pI are to be studied, focusing with narrow range pI
gradients (1 pH unit) can provide greater separation and resolution. After
electrophoresis, the type and degree of protein phosphorylation can be investigated (Murthy
and Iqbal, 1991; Gold et al., 1994), monosaccharide composition can be determined
(Weitzhandler et al., 1993; Packer et al., 1995), and the structure and exact site of
glycoamino acids can be investigated by either Edman degradation based techniques
or by mass spectrometry (Pisano et al., 1993; Huberty et al., 1993; Carr, Huddleston
and Bean, 1993). With further development of rapid techniques, investigation of
phosphorylation and monosaccharides by chromatographic or mass spectrometry
means is likely to become a routine step in the characterisation of post-translational
modifications of proteins from reference maps.

A number of computer programs are available for matching peptide masses against
databases (reviewed in Cottrell, 1994). Matching is usually undertaken in an
interactive manner, whereby peaks of mass 500-5000 Da are selected and matched under
various search parameters including MW of protein, mass accuracy of peptides, and
number of missed enzyme cleavages allowed (Henzel et al., 1993; Mortz et al., 1994;
Rasmussen et al., 1994). The correct protein identity is the protein which has the most
peptide masses in common with the unknown sample. Identities have been established
with as few as three peptides, but unambiguous identification is thought to require a
mass spectrometry map covering most peptides of the protein (Mortz et al., 1994;
Yates et al., 1993). To date, peptide mass fingerprinting of proteins has been
undertaken from the human myocardial protein and keratinocyte maps, from an E. coli
2-D gel, and from reference maps of Spiroplasma melliferum and Mycoplasma
genitalium (Sutton et al., 1995; Rasmussen et al., 1994; Henzel et al., 1993; Cordwell
et al., 1995; Wasinger et al., 1995), although the technique is most powerful when
used in combination with another protein identification technique (Rasmussen et al.,
1994; Cordwell et al., 1995).
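
The matching procedure described above can be illustrated with a small Python sketch. It is not the algorithm of any of the cited programs: it assumes a simple trypsin rule (cleave after K or R unless the next residue is P, a common convention that is not stated in the text), uses standard monoisotopic residue masses, ignores all modifications, and scores a database entry by the number of observed masses it explains. The function names, the 0.5 Da tolerance and the database layout are hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(seq, missed=0):
    """Theoretical tryptic peptides, allowing up to `missed` missed cleavages."""
    cuts = [0]
    for i in range(len(seq) - 1):
        if seq[i] in "KR" and seq[i + 1] != "P":
            cuts.append(i + 1)
    cuts.append(len(seq))
    peptides = []
    for i in range(len(cuts) - 1):
        for j in range(i + 1, min(i + 2 + missed, len(cuts))):
            peptides.append(seq[cuts[i]:cuts[j]])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, seq, tol=0.5, missed=0, lo=500.0, hi=5000.0):
    """Number of observed masses in [lo, hi] explained by a peptide of seq."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(seq, missed)]
    selected = [m for m in observed if lo <= m <= hi]
    return sum(1 for m in selected if any(abs(m - t) <= tol for t in theoretical))


def rank_by_shared_masses(observed, database, tol=0.5, missed=0):
    """Rank {name: sequence} entries by shared peptide masses, best first."""
    hits = [(count_matches(observed, seq, tol, missed), name)
            for name, seq in database.items()]
    return sorted(hits, key=lambda hit: (-hit[0], hit[1]))
```

In practice the observed list would first be cleaned of autodigestion and contaminant peaks, and modified peptides would not match their unmodified theoretical masses, which is why the interpretation step discussed earlier matters.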



MASS SPECTROMETRY SEQUENCE TAGGING 



An extension of peptide mass fingerprinting has recently been described, called
peptide sequence tagging (Mann and Wilm, 1994; Mann, 1995). This uses tandem
mass spectrometry (MS/MS) to initially determine the mass of peptides, then subject
them to fragmentation by collision with a gas, and finally determine the mass of
fragments. The resulting spectra give information about a peptide's amino acid
sequence. The fragmentation masses of peptides can rarely be used to assign a complete
sequence, but they usually allow a short 'sequence tag' of 2 or 3 amino acids to be
determined. This sequence tag and the original peptide mass are matched by computer
against a database, providing a likely identity of the peptide and the protein it came from.
The major drawback for this technique as a mass screening tool is the complexity of the
mass data generated and the high level of expertise required for its interpretation.
Nevertheless, it represents a useful new protein identification method which greatly
increases the power of peptide mass fingerprinting protein identification.

Cross-species protein identification

Protein sequence databases continue to grow at a rapid rate, yet it is not widely
appreciated that close to 90% of all information contained in current protein databases
comes from only 10 species (A. Bairoch, Pers. Comm.). Fortunately, this information
can be used to study proteomes of organisms that are poorly defined at the molecular
level, via 2-D electrophoresis and 'cross-species' protein identification (Cordwell et
al., 1995; Wasinger et al., 1995). This approach allows proteins from reference maps
of many different species to be identified without the need for the corresponding genes
to be cloned and sequenced. This is particularly true for 'housekeeping' proteins, such
as enzymes involved in glycolysis, DNA manipulation and protein manufacture,
which are highly conserved across species boundaries. Proteins that cannot be
identified across species boundaries can then become the focus of further protein
characterisation and DNA sequencing efforts.

composition, pI and MW, often the correct identification of that protein [illegible]
ranking of the list [illegible]
[illegible] 1995). Taking advantage of [illegible] observation, [illegible]
spectrometry 'sequence tag' concept (Mann and Wilm, 1994) to develop a com[illegible]
(Wilkins et al., submitted). This involves the N-terminal sequencing of PVDF-blotted
proteins by Edman degradation for 3 or 4 cycles to create a 'sequence tag', following
which the same sample is used for amino acid analysis. As only a few amino acids are
removed from the protein, its composition is not significantly altered. Furthermore,
since only a small amount of protein sequence is required, fast but low repetitive yield
Edman degradation cycles can be used. Modifications to current procedures should
allow 3 cycles to be completed in 1 h, thereby allowing the screening of 100 or more
proteins per week on one automated, multi-cartridge sequenator. Amino acid
composition, pI and MW of proteins are matched against databases as described above and
N-terminal sequences of best matching proteins are checked with the 'sequence tag'
to confirm the protein identity (Figure 5). This technique will be less useful when
proteins are N-terminally blocked, but as only a few N-terminal amino acids are
susceptible to the acetyl, formyl, or pyroglutamyl modifications that cause blockade,
this may itself provide useful information for sequence tag identification. A strength
of N-terminal sequence tag and amino acid composition protein identification is that
data generated are quickly and easily interpreted.
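
The confirmation step described above, checking the N-terminal sequences of the best matching proteins against the sequence tag, can be sketched as below. This is a minimal illustration under assumptions: candidates are supplied already ranked by amino acid composition, database sequences are available as plain strings keyed by name, and N-terminal processing (such as removal of the initiator methionine) is ignored. The function name and data layout are hypothetical.

```python
def confirm_with_tag(ranked_candidates, n_terminal_tag, sequences):
    """Candidates, in rank order, whose database sequence starts with the tag.

    ranked_candidates: names ordered best match first (from composition matching).
    n_terminal_tag:    the 3 or 4 residues read by Edman degradation, e.g. "MLKR".
    sequences:         mapping of candidate name to its database sequence.
    Candidates with no sequence on record are skipped rather than rejected loudly.
    """
    tag = n_terminal_tag.replace(" ", "").upper()
    confirmed = []
    for name in ranked_candidates:
        sequence = sequences.get(name, "")
        if sequence.upper().startswith(tag):
            confirmed.append(name)
    return confirmed
```

With a tag of four residues, a chance agreement among the top few candidates is unlikely, which is why a short tag can settle a ranking that composition scores alone leave ambiguous.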



PROTEIN IDENTIFICATION BY PEPTIDE MASS FINGERPRINTING 

Techniques for the identification of proteins by peptide mass fingerprinting have 
recently been described (Henzel et al., 1993; Pappin, Hojrup and Bleasby, 1993; 
James et al., 1993; Mann, Hojrup and Roepstorff, 1993; Yates et al., 1993; Mortz et 
al., 1994; Sutton et al., 1995). This involves the generation of peptides from proteins 
using residue-specific enzymes, the determination of peptide masses, and the matching 
of these masses against theoretical peptide libraries generated from protein 
sequence databases. As proteins have different amino acid sequences, their peptides 
should produce characteristic 'fingerprints'. 
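
The matching step can be illustrated with a minimal Python sketch: each database 
sequence is digested in silico, the theoretical peptide masses are computed, and 
entries are ranked by how many measured masses they explain. The cleavage rule 
(after K or R, but not before P), the use of monoisotopic masses of unmodified 
peptides, the 0.5 Da tolerance, and all function names are assumptions made for 
illustration; they do not reproduce any of the cited programs. 

```python
# Monoisotopic residue masses in Da (standard values for unmodified residues).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(sequence):
    """Cleave C-terminal to K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and following != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_matches(measured, sequence, tolerance=0.5):
    """Number of measured masses within `tolerance` Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(any(abs(m - t) <= tolerance for t in theoretical)
               for m in measured)


def rank_by_fingerprint(measured, database, tolerance=0.5):
    """Return database names ordered from most to fewest matching masses."""
    return sorted(database,
                  key=lambda name: count_matches(measured, database[name],
                                                 tolerance),
                  reverse=True)
```

A real search would also have to allow for missed cleavages, modified residues, and 
enzyme autodigestion peaks, all of which this sketch ignores. 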

The first step of peptide mass fingerprinting is protein digestion. Proteins within the 
gel matrix or bound to PVDF can be enzymatically 

digests are reported to produce more enzyme autodigestion products, which complicate 
subsequent peptide mass analysis (James et al., 1993; Rasmussen et al., 1994; 
Mortz et al., 1994). The enzyme of choice for digestion is currently trypsin (of 
modified sequencing grade), but other enzymes (Lys-C or S. aureus V8 protease) have 
also been used (Pappin, Hojrup and Bleasby, 1993). To maximise the number of 
peptides obtained, it is desirable for protein samples to be reduced and alkylated prior 
to digestion (Mortz et al., 1994; Henzel et al., 1993). This ensures that all disulfide 
bonds of the protein are broken, and produces protein conformations that are more 
amenable to digestion. Surprisingly, chemical digestion methods such as cyanogen 
bromide (methionine specific), formic acid (aspartic acid specific), and 
2-(2'-nitrophenylsulfenyl)-3-methyl-3'-bromoindolenine (tryptophan specific) have not 
been explored as means of peptide production for mass fingerprinting, even though 
they are rapid and may circumvent some problems associated with enzyme digestions 
Figure 4. Computer printout from ExPASy server where the empirical amino acid composition, 
estimated pI and MW of a protein from a 2-D reference map of E. coli were matched against all entries in 
SWISS-PROT for E. coli. The correct identification, aspartate carbamoyltransferase, is shown in bold. Low 
scores indicate a good match. Note how matching within a defined pI and MW range (lower set of proteins) 
has greatly increased the score difference between the first and second ranking proteins. This score 
difference gives high confidence in the identification, and is only observed where the top ranking protein 
is the correct identification (Wilkins et al., 1995). 

graphy-based analysis. Proteins blotted to PVDF membranes can be hydrolysed in 1 h 
at 155 °C, amino acids extracted in a single brief step, and each sample automatically 
derivatised and separated by chromatography in under 40 minutes (Wilkins et al., 
1995; Ou et al., 1995). In this manner, one operator can routinely analyse 100 proteins 
per week on one HPLC unit. This technology lends itself to automation, and it is 
anticipated that instruments with even greater sample throughput will be developed. 
When proteins have been prepared by micropreparative 2-D electrophoresis (Hanash 
et al., 1991; Bjellqvist et al., 1993b), blotted to a PVDF membrane and stained with 
amido black, any visible protein spot is of sufficient quantity for amino acid analysis 
(Cordwell et al., 1995; Wasinger et al., 1995; Wilkins et al., 1995). 

After the amino acid composition of a protein has been determined, computer 
programs are used to match it against the calculated compositions of proteins in 
databases (Eckerskorn et al., 1988; Sibbald, Sommerfeldt and Argos, 1991; Jungblut 
et al., 1992; Shaw, 1993; Hobohm, Houthaeve and Sander, 1994; Wilkins et al., 
1995). Matching is usually done with only 15 or 16 amino acids, as cysteine and 






Marc R. Wilkins et al. 



(Berners-Lee et al., 1992), allowing any computer connected to the internet to access 
the stored information and images. Navigation within and between the three databases 
is seamless, as all potential crosslinks are highlighted as hypertext on the display and 
can be selected with a computer mouse. From these databases, detailed information 
about a protein, including amino acid sequence and known post-translational 
modifications, can be obtained, the precise protein spot it corresponds to on a reference gel 
image can be viewed if known, and the 3-D structure of the molecule can be seen if 
available. References to nucleic acid and other databases are also given to provide 
access to information stored elsewhere. 

Organism databases, containing detailed protein and nucleic acid information 
about a species, are becoming common as genome and proteome projects progress. 
These differ from nucleic acid or protein sequence databases like GenBank or 
SWISS-PROT because they are image based, and contain information about chromosomal 
map positions, transcription of genes, and protein expression patterns. The 
Escherichia coli gene-protein database (VanBogelen, Hutton and Neidhardt, 1990; 
VanBogelen and Neidhardt, 1991; VanBogelen et al., 1992), known as the 
ECO2DBASE, is one example. It contains gene and protein names, 2-D gel spot 
information (including pI and MW estimates, and spot identification), genetic 
information (GenBank or EMBL codes, chromosomal location, location on Kohara clones 
(Kohara, Akiyama and Isono, 1987), transcription direction of genes), and protein 
regulatory information (level of protein expression under different growth regimes, 
member of regulon or stimulon). All entries in the ECO2DBASE are also 
cross-referenced to the SWISS-PROT database (Bairoch and Boeckmann, 1994). It is 
anticipated that organism databases will soon become a standard means of storing all 
available information about a particular species. However, there is currently no 
consistent manner in which organism databases are assembled, which may hamper 
comparisons in the future. 

Identification and characterisation of proteins from 2-D gels 

The number of proteins identified on a 2-D reference map determines its usefulness as 
a research and reference tool. As most reference maps have only a small proportion of 
proteins identified, a major aim of current proteome projects is to screen many proteins 
from 2-D maps, in order to define them as 'known' in current nucleic acid and protein 
databases, or as 'unknown'. Protein identification assists in confirmation of DNA 
open reading frames, and provides focus for DNA sequencing projects and protein 
characterisation efforts by pointing to proteins that are novel. Since there may be 
3000-4000 proteins from a single 2-D map that require identification, the challenge in 
protein screening is to identify proteins quickly, with a minimum of cost and effort. 

Traditionally, proteins from 2-D gels have been identified by techniques such as 
immunoblotting, N-terminal microsequencing, internal peptide sequencing, 
comigration of unknown proteins with known proteins, or by overexpression of 
homologous genes of interest in the organism under study (Matsudaira, 1987; Rosenfeld 
et al., 1992; VanBogelen et al., 1992; Celis et al., 1993; Honore et al., 1993; Garrels 
et al., 1994). Whilst these techniques are powerful identification tools, they are too 
expensive or time and labour intensive to use in mass screening programs. A 
hierarchical approach to mass protein identification has been recently suggested as an 
cadmium toxicity in urinary proteins (Myrick et al., 1993). Complex global changes 
in protein expression as a result of gene disruptions have also been investigated (S. Fey 
and P. Mose Larsen, personal communication). Impressively, large gel sets showing 
protein expression under different conditions can be globally investigated using 
statistical methods that find groups of related objects within a set. For example, the 
REF52 rat cell line database, consisting of 79 gels from 12 experimental groups where 
each gel contains quantitative data for 1600 cross-matched proteins, has been analysed 
by cluster analysis (Garrels et al., 1990). This revealed clusters of proteins that, for 
example, were induced or repressed similarly under simian virus 40 or adenovirus 
transformation, suggesting a common mechanism. Protein groups that were induced 
or repressed during culture growth to confluence were also found. It is obvious that the 
potential for investigation of cellular control mechanisms by these approaches is 
immense. It is equally clear that investigations of gene expression of this scale are 
currently technically impossible using nucleic-acid based techniques. 
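
The idea of finding groups of co-regulated proteins across a gel set can be illustrated 
with a small Python sketch. It is not the clustering method used for the REF52 
database, which the text does not specify. As an assumption for illustration, spots 
are grouped whenever their quantitative profiles across gels have a Pearson 
correlation above a chosen threshold (single-linkage grouping); the spot names and 
the threshold are hypothetical. 

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def cluster_spots(profiles, threshold=0.9):
    """Group spots whose profiles correlate at or above `threshold`.

    `profiles` maps a spot name to its quantities across the gel set.
    Groups are merged transitively using a union-find structure.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path compression
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())
```

With 1600 cross-matched spots the pairwise comparison is about 1.3 million 
correlations, which is small by present standards; the sketch makes no attempt to 
handle missing spots or normalisation between gels. 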

Table 3. Some proteome databases and their special features 

Proteome database: E. coli gene-protein database 
Special features: Gel spots linked with GenBank and Kohara clones; quantitative 
spot measurements under different growth conditions 
References: VanBogelen and Neidhardt, 1991; VanBogelen et al., 1992 

Proteome database: Human heart databases 
Special features: Identification of disease markers; two separate databases have 
been established 
References: Baker et al., 1992; Corbett et al., 1994b; Jungblut et al., 1994 

Proteome database: Human keratinocyte database 
Special features: Extensive identifications; quantitative spot measurements 
of transformed cells; identification of disease markers 
References: Celis et al., 1990a; Celis et al., 1993; Celis and Olsen, 1994 

Proteome database: Mouse embryo database 
Special features: Quantitative spot measurement through 1 to 4 cell stage 
References: Latham et al., 1991; Latham et al., 1992 

Proteome database: Mouse liver database (Argonne Protein Mapping Group) 
Special features: Documents changes due to exposure to ionizing radiation 
and toxic chemicals 
References: Giometti, Taylor and Tollaksen, 1992 

Proteome database: Rat liver epithelial database 
Special features: Detailed subcellular fractionation studies 
References: [...] 

Proteome database: Rat liver database 
Special features: Extensive studies on regulation of proteins by drugs and toxic 
agents 
References: Anderson and Anderson, 1991 

Proteome database: REF52 rat cell line database 
Special features: Accessible via World Wide Web; quantitative spot measurements 
under different conditions 
References: [...] 

Proteome database: SWISS-2DPAGE containing human reference maps 
Special features: Accessible via World Wide Web; completely integrated with 
SWISS-PROT and SWISS-3DIMAGE 
References: Hochstrasser et al., 1992; Golaz et al., 1993 

Proteome database: Yeast Protein Database (YPD) and Yeast Electrophoretic 
Protein Database (YEPD) 
Special features: Completely crossreferenced organism database; YPD has 
extensive information on over 3500 proteins; YEPD has many identifications 
References: Garrels et al., 1994 
Figure 3. Computer processing of gel images. Shown is a wide pI range 2-D separation of human liver 
proteins, processed by Melanie software (Appel et al., 1991). (A) Original gel image as captured by laser 
densitometer. (B) Gel image after processing to remove streaking and background. (C) Outline definition 
of all spots on the gel. 
no protein stain is able consistently to detect proteins over a wide range of 
concentrations, isoelectric points and amino acid compositions, and with a variety of 
post-translational modifications (Goldberg et al., 1988; Li et al., 1989). Furthermore, 
there are large differences in staining pattern when identical gels or blots are subjected 
to different stains, including amido black, imidazole zinc, india ink, ponceau S, 
colloidal gold, or coomassie blue (Tovey, Ford and Baldo, 1987; Ortiz et al., 1992). 
The most common means of quantitating large numbers of proteins in a 2-D gel 
involves the radiolabelling of protein samples prior to electrophoresis, and protein 
quantitation based on fluorography and image analysis or liquid scintillation counting 
(Garrels, 1989; Celis and Olsen, 1994). However, proteins which do not contain 
methionine cannot be detected if only [35S]methionine is used for labelling. Amino 
acid analysis of protein spots visualised by other techniques presents a likely means of 
protein quantitation for the future. 

BLOTTING OF PROTEINS TO MEMBRANES 

Electrophoretic blotting of proteins from two-dimensional polyacrylamide gels to 
membranes presents many options for protein identification and microcharacterisation 
which are not possible when proteins remain in gels. For example, when proteins are 
blotted to polyvinylidene difluoride (PVDF) membranes, they can be identified by 
N-terminal sequencing, amino acid analysis, or immunoblotting, or they may be subjected 
to endoproteinase digestion, monosaccharide analysis, phosphate analysis, or direct 
matrix-assisted laser desorption ionisation mass spectrometry (Matsudaira, 1987; 
Wilkins et al., 1995; Jungblut et al., 1994; Sutton et al., 1995; Rasmussen et al., 1994; 
Weitzhandler et al., 1993; Murthy and Iqbal, 1991; Eckerskorn et al., 1992). It is 
possible to combine some of these procedures on a single protein spot on a PVDF 
membrane (Packer et al., 1995; Wilkins et al., submitted; Weitzhandler et al., 1993). 
This is useful when minimal amounts of protein are available for analysis. These 
techniques will be explored in detail later in this review. Notwithstanding the above, 
there are some disadvantages associated with blotting of proteins to membranes. 
There is always loss of sample during blotting procedures (Eckerskorn and Lottspeich, 
1993), and common protein detection methods are less sensitive or not applicable to 
membranes (Table 1), presenting difficulties for the analysis of low abundance 
proteins. Detailed discussion of the merits of available membranes and common 
blotting techniques can be found elsewhere (Eckerskorn and Lottspeich, 1993; Strupat 
et al., 1994; Patterson, 1994). 



2-D gel analysis, documentation, and proteome databases 

Following protein electrophoresis and detection, detailed analysis of gel images is 
undertaken with computer systems. For proteome projects, the aim of this analysis is 
to catalogue all spots from the 2-D gel in a qualitative and if possible quantitative 
manner, so as to define the number of proteins present and their levels of expression. 
Reference gel images, constructed from one or more gels, form the basis of two- 
dimensional gel databases. These databases also contain protein spot identities and 
Notwithstanding the advances described above, there is an increasing demand to 
improve the reproducibility of 2-D electrophoresis to facilitate database construction 
and proteome studies. Harrington et al. (1993) explain that if a gel resolves 4000 
protein spots, and there is 99.5% spot matching from gel to gel, this will produce 20 
spot errors per gel. This amount of error, which might accumulate with each gel to gel 
comparison used in database construction, could produce an unacceptable degree of 
uncertainty in gel databases. To address these issues, partial automation of large 2-D 
gel separations has been undertaken (Nokihara, Morita and Kuriki, 1992; Harrington 
et al., 1993). Although results are preliminary, spot to spot positional reproducibility 
in one study was found to be threefold improved over manual methods (Harrington et 
al., 1993). It should be noted that small 2-D gel formats (50 x 43 mm) have been 
almost completely automated (Brewer et al., 1986), although these are not generally 
used for database studies. 
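
The arithmetic behind the 20-errors figure, and one way the error might accumulate, 
can be checked with a short Python sketch. The first function reproduces the 
calculation quoted from Harrington et al. (1993). The second is an assumption added 
for illustration: it treats mismatches in successive gel to gel comparisons as 
independent, which the text does not state. 

```python
def spot_errors(spots, match_rate):
    """Expected number of mismatched spots in one gel to gel comparison."""
    return spots * (1.0 - match_rate)


def spots_still_correct(spots, match_rate, comparisons):
    """Spots expected to remain correctly matched after a chain of comparisons.

    Assumes each comparison fails independently with the same probability.
    """
    return spots * match_rate ** comparisons


# 4000 spots at 99.5% matching: about 20 errors per comparison.
errors = spot_errors(4000, 0.995)

# Under the independence assumption, a chain of 10 comparisons leaves
# roughly 3800 of the 4000 spots correctly matched.
remaining = spots_still_correct(4000, 0.995, 10)
```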

MICROPREPARATIVE 2-D GEL ELECTROPHORESIS 

With the advent of affordable protein microcharacterisation techniques, including 
N-terminal microsequencing, amino acid analysis, peptide mass fingerprinting, phosphate 
analysis and monosaccharide compositional analysis, a new challenge for 2-D 
electrophoresis has been to maintain high resolution and reproducibility but to provide 
protein in sufficient quantities for chemical analysis (high nanogram to low microgram 
quantities of proteins per spot). This becomes difficult to achieve with very complex 
samples such as whole bacterial cells, as the initial protein load is divided among 2000 
to 4000 protein species. Two approaches are used for producing amounts of material 
that can be chemically characterised. The first method is to run multiple gels, collect 
and pool the spots of interest, and subject them to concentration (Ji et al., 1994; Walsh 
et al., 1995; Rasmussen et al., 1992). In this approach, the concentration process must 
also act as a purification step to remove accumulated electrophoretic contaminants 
such as glycine. A more elegant approach has been to exploit the high loading capacity 
of IPG isoelectric focusing. The high loading capacity of immobilised pH gradients 
was described early (Ek, Bjellqvist and Righetti, 1983), but has only recently been 
applied to 2-D electrophoresis (Hanash et al., 1991; Bjellqvist et al., 1993b). Up to 15 
mg of protein can be applied to a single gel, yielding microgram quantities of 
hundreds of protein species. A further benefit of this approach is that proteins present in 
low abundance, which may not be visualised by lower protein loads, are more likely 
to be detected. The use of electrophoretic or chromatographic prefractionation 
techniques (Hochstrasser et al., 1991a; Harrington et al., 1992), followed by high loading 
of narrow-range IPG separations (Bjellqvist et al., 1993b) provides a likely solution to 
studies on proteins present in low abundance. 
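
A rough calculation shows why a 15 mg load yields microgram quantities per spot. 
The Python sketch below divides the load evenly among the protein species, which 
is a deliberate simplification made for illustration: real abundances differ by 
orders of magnitude, so the result is only an average, and the function name is 
hypothetical. 

```python
def mean_micrograms_per_species(load_mg, n_species):
    """Average protein per species in micrograms, assuming an even split."""
    return load_mg * 1000.0 / n_species


# A 15 mg load spread over 2000 to 4000 species averages 7.5 to 3.75 micrograms.
upper = mean_micrograms_per_species(15, 2000)
lower = mean_micrograms_per_species(15, 4000)
```

Abundant species will exceed this average and rare ones fall well below it, which is 
consistent with the statement that only hundreds of species reach microgram 
quantities. 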

Methods of protein detection 

There are many means for detecting proteins from 2-D gels. The method used will be 
dictated by factors including protein load on gel (analytical or preparative), the 
purpose of the gel (for protein quantitation or for blotting and chemical characterisa- 
tion), and the sensitivity required. The most common means of protein detection and 
their applications are shown in Table 1. Most detection methods have drawbacks, for 
vital to allow comparison of gels from day to day and between research sites. These 
factors can be difficult to achieve. 

Carrier ampholytes are a common means of isoelectric focusing for the first 
dimension of 2-D electrophoresis. Gels are usually focused to equilibrium to separate 
proteins in the pI range 4 to 8, and run in a non-equilibrium mode (NEPHGE) to 
separate proteins of higher pI (7 to 11.5) (O'Farrell, 1975; O'Farrell, Goodman and 
O'Farrell, 1977). Unfortunately, the use of carrier ampholytes in the isoelectric 
focusing procedure is susceptible to 'cathode drift', whereby pH gradients established 
by prefocusing of ampholytes slowly change with time (Righetti and Drysdale, 1973). 
Carrier ampholyte pH gradients are also distorted by high salt concentration of 
samples (Bjellqvist et al., 1982), and by high protein load (O'Farrell, 1975). A further 
limitation is that isoelectric focusing gels, which are cast and subject to 
electrophoresis in narrow glass tubes, need to be extruded by mechanical means before application 
to the second dimension - a procedure that potentially distorts the gel. Nevertheless, 
many of the above shortcomings can be avoided by loading small amounts of 14C or 35S 
radiolabeled samples (Garrels, 1989; Neidhardt et al., 1989; Vandekerckhove et al., 
1990). High sensitivity detection is then achieved through use of fluorography or 
phosphorimaging plates (Bonner and Laskey, 1974; Johnston, Pickett and Barker, 
1990; Patterson and Latter, 1993). However, this approach is only practicable for 
organisms or tissues that can be radiolabeled. 

An alternative technique, which is becoming the method of choice for the first 
dimension separation of proteins, involves isoelectric focusing in immobilized pH 
gradient (IPG) gels (Bjellqvist et al., 1982; Gorg, Postel and Gunther, 1988; Righetti, 
1990). Immobilized pH gradients are formed by the covalent coupling of the pH 
gradient into an acrylamide matrix, creating a gradient that is completely stable with 
time. IPG gels are usually poured onto a stiff backing film, which is mechanically 
strong and provides easy gel handling (Ostergren, Eriksson and Bjellqvist, 1988). The 
major advantages of IPG separations are that they do not suffer from cathodic drift, 
they allow focusing of basic and very acidic proteins to equilibrium, pH gradients can 
be precisely tailored (linear, stepwise, sigmoidal), and that separations over a very 
narrow pH range are possible (0.05 pH units per cm) (Righetti, 1990; Bjellqvist et al., 
1982, 1993a; Sinha et al., 1990; Gorg et al., 1988; Gelfi et al., 1987; Gunther et al., 
1988). However, it is not currently possible to use IPG gels to separate very basic 
proteins of isoelectric point greater than 10, although this is under development. 
Narrow pH range separations are useful to address problems of protein co-migration 
in complex samples, allowing 'zooming in' on regions of a gel (Figure 2). IPG gel 
strips are now commercially available, which begin to address the problems of intra- 
and inter-lab isoelectric focusing reproducibility. 

There are two means of electrophoresis for the second dimension separation of 
proteins: vertical slab gels and horizontal ultrathin gels (Gorg, Postel and Gunther, 
1988). Both are usually SDS-containing gradient gels of approximately 11% to 15% 
acrylamide, which separate proteins in the molecular mass range of 10 - 150 kD. A 
stacking gel is not usually used with slab gels, but is necessary when using horizontal 
gel setups (Gorg, Postel and Gunther, 1988). Comparisons have shown that there is 
little or no difference in the reproducibility of electrophoresis using either approach 
(Corbett et al., 1994a), but commercially available vertical or horizontal precast gels 
will provide greater reproducibility for occasional users. For slab gel electrophoresis, 
identify all cDNA species, and the approach does not easily allow a systematic 
screening. Analysis of gene expression by the study of proteins present in a cell or 
tissue presents a favorable alternative. This can be achieved by use of two-dimensional 
(2-D) gel electrophoresis, qualitative computer image analysis, and protein 
identification techniques to create 'reference maps' of all detectable proteins. Such reference 
maps establish patterns of normal and abnormal gene expression in the organism, and 
allow the examination of some post-translational protein modifications which are 
functionally important for many proteins. It is possible to screen proteins 
systematically from reference maps to establish their identities. 

To define protein-based gene expression analysis, the concept of the 'proteome' 
was recently proposed (Wilkins et al., 1995; Wasinger et al., 1995). A proteome is the 
entire PROTein complement expressed by a genOME or by a cell or tissue type. The 
concept of the proteome has some differences from that of the genome, as while there 
is only one definitive genome of an organism, the proteome is an entity which can 
change under different conditions, and can be dissimilar in different tissues of a single 
organism. A proteome nevertheless remains a direct product of a genome. 
Interestingly, the number of proteins in a proteome can exceed the number of genes present, 
as protein products expressed by alternative gene splicing or with different 
post-translational modifications are observed as separate molecules on a 2-D gel. As an 
extrapolation of the concept of the 'genome project', a 'proteome project' is research 
which seeks to identify and characterise the proteins present in a cell or tissue and 
define their patterns of expression. 

Proteome projects present challenges of a similar magnitude to that of genome 
projects. Technically, the 2-D gel electrophoresis must be reproducible and of high 
resolution, allowing the separation and detection of the thousands of proteins in a cell. 
Low copy number proteins should be detectable. There should be computer gel image 
analysis systems that can qualitatively and quantitatively catalog the electrophoretically 
separated proteins, to form reference maps. A range of rapid and reliable techniques 
must be available for the identification and characterisation of proteins. As a conse- 
quence of a proteome project, protein databases must be assembled that contain 
reference information about proteins: such databases must be linked to genomic 
databases and protein reference maps. Databases should be widely accessible and easy 
to use. 

Recently, there have been many changes in the techniques and resources available 
for the analysis of proteomes. It is the aim of this chapter to discuss the status of the 
areas outlined above, and to review briefly the progress of some current proteome 
projects. 

Two-dimensional electrophoresis of proteomes 

Two-dimensional (2-D) gel electrophoresis involves the separation of proteins by their 
isoelectric point in the first dimension, then separation according to molecular weight 
by sodium dodecyl sulfate electrophoresis in the second dimension. Since first 
described (Klose, 1975; O'Farrell, 1975; Scheele, 1975), it has become the method of 
choice for the separation of complex mixtures of proteins, albeit with many 
modifications to the original techniques. 2-D electrophoresis forms the basis of proteome 
projects through separating proteins by their size and charge (Hochstrasser et al., 
ABSTRACT Analysis of cellular protein patterns by 
computer-aided 2-dimensional gel electrophoresis together 
with recent advances in protein sequence analysis have 
made possible the establishment of comprehensive 
2-dimensional gel protein databases that may link 
protein and DNA information and that offer a global 
approach to the study of the cell. Using the integrated 
approach offered by 2-dimensional gel protein databases it 
is now possible to reveal phenotype specific protein (or 
proteins), to microsequence them, to search for homology 
with previously identified proteins, to clone the cDNAs, 
to assign partial protein sequence to genes for which the 
full DNA sequence and the chromosome location is 
known, and to study the regulatory properties and 
function of groups of proteins that are coordinately expressed 
in a given biological process. Human 2-dimensional gel 
protein databases are becoming increasingly important in 
view of the concerted effort to map and sequence the 
entire genome. Celis, J. E.; Rasmussen, H. H.; Leffers, 
H.; Madsen, P.; Honore, B.; Gesser, B.; Dejgaard, K.; 
Vandekerckhove, J. Human cellular protein patterns and 
their link to genome DNA sequence data: usefulness of 
two-dimensional gel electrophoresis and microsequencing. 
FASEB J. 5: 2200-2208; 1991. 

Key Words: human protein patterns • 2-dimensional gel protein 
databases • gene expression • microsequencing • cDNA cloning 
• linking protein and DNA information • genome mapping and 
sequencing 



Proteins synthesized from information contained in the 
DNA orchestrate most cellular functions. The total number 
of proteins synthesized by a typical human cell is unknown 
although current estimates range from 3000 to 6000. Of 
these, as many as 70% may perform household functions 
and are expected to be shared by all cell types irrespective of 
their origin. There are many different cell types in the 
human body with perhaps 30,000 to 50,000 proteins expressed 
in the organism as a whole judged from the fact that about 
[...]% of the haploid genome correspond to genes. Today only 
a small fraction of the total set of proteins has been identified, 
and little is known about the protein patterns of individual 
cell types or their variation under physiological and 
abnormal conditions. 

For the past 15 years, high resolution 2-dimensional gel 
electrophoresis has been the technique of choice to 
determine the protein composition of a given cell type and for 
monitoring changes in gene activity through quantitative 
and qualitative analysis of the thousands of proteins that 
orchestrate various cellular functions (refs 1-6 and references 
therein). The technique originally described by O'Farrell 
separates proteins in terms of their isoelectric point (pI) and 
molecular weight. Usually one chooses a condition of 
interest and the cell reveals the global protein behavioral 
response as all detected proteins can be analyzed both 
qualitatively and quantitatively in relation to each other. At 
present, most available 2-dimensional gel techniques 
(regular gel format) can resolve between 1000 and 2000 proteins 
from a given mammalian cell type, a number that 
corresponds to about 2 million base pairs of coded DNA. Less 
abundant proteins can be detected by analyzing partially 
purified cellular fractions. 

Two-dimensional gel electrophoresis has been widely applied 
to analysis of cellular protein patterns from bacteria to 
mammalian cells (refs 1-6, and references therein). In spite of 
much work, however, information gathered from these 
studies has not reached the scientific community in its 
fullness because of lack of standardized gel systems and the lack 
of means for storing and communicating protein 
information. Only recently, because of the development of 
appropriate computer software (7-13), has it been possible to scan 
gels, assign numbers to individual proteins, and store the 
wealth of information in quantitative and qualitative 
comprehensive 2-dimensional gel protein databases (4, 14-23), 
i.e., those containing information about the various 
properties (physical, chemical, biological, biochemical, 
physiological, genetic, immunological, architectural, etc.) of all the 
proteins that can be detected in a given cell type. Such 
integrated 2-dimensional gel protein databases offer an easy 
and standardized medium in which to store and 
communicate protein information and provide a unique framework in 
which to focus a multidisciplinary approach to study the cell. 
Once a protein is identified in the database, all of the 
information accumulated can be easily retrieved and made 
available to the researcher. In the long run, protein databases are 
expected to foster a wide variety of biological information 
that may be instrumental to researchers working in many 
areas of biology: among others, cancer and oncogene 
studies, differentiation, development, drug development and 
testing, genetic variation, and diagnosis of genetic and 
clinical diseases (Fig. 1). 

The approach using systematic 2-dimensional gel protein 
analysis has recently gained a new dimension with the ad- 
vent of techniques to microsequence major proteins recorded 
Figure 2. A) Synthetic image of a fraction of an IEF gel of the master image of AMA cellular proteins. B) As in A but showing numbers 
assigned to each spot. C) Comparison of AMA (left) and normal human embryonal lung MRC-5 fibroblasts (right) IEF protein patterns. 
Matched proteins are indicated by a + or by the same letters in both gels. Once a protein is matched, information contained in the various 
categories available in the master AMA database can be transferred. D) Synthetic image of a fraction of an IEF fluorogram of 
[35S]methionine-labeled proteins from normal human MRC-5 fibroblasts. The histograms show levels of synthesis of a few proteins in MRC-5 (left 
bar) and SV40 transformed MRC-5 (right bar) fibroblasts. E) Polypeptides that contain information under the category glycolytic pathway. 
F) [...] annotation for spot [...] allows the operator to [...] about categories and information available for a given protein. 

G) Abundance of cytoskeletal and cytoskeletal-related proteins in quiescent, proliferating, and SV40-transformed MRC-5 
fibroblasts. H) Polypeptides that contain information under the category partial amino acid sequences. 
cross-matched experiments (18, 22).
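
As a rough illustration of matching seeded by a few manually chosen
landmark spots (Fig. 2C), the Python sketch below fits a separate linear
map for each gel axis from the landmark pairs and then pairs every spot
with its nearest neighbour in the second gel. It is not the matching
algorithm of Garrels et al. (12); the function names, the per-axis
linear model, the distance tolerance, and all coordinates are
assumptions made only for illustration.

```python
def fit_axis(src, dst):
    """Least-squares a, b such that dst is approximately a * src + b."""
    n = len(src)
    mean_s = sum(src) / n
    mean_d = sum(dst) / n
    var = sum((s - mean_s) ** 2 for s in src)
    cov = sum((s - mean_s) * (d - mean_d) for s, d in zip(src, dst))
    a = cov / var
    return a, mean_d - a * mean_s


def match_spots(spots_a, spots_b, landmarks, tolerance=2.0):
    """Match spots of gel A to gel B.

    spots_a, spots_b: {spot_id: (x, y)}
    landmarks: [(id_in_a, id_in_b), ...] supplied by the operator.
    Note: this simple version does not enforce one-to-one matches.
    """
    ax, bx = fit_axis([spots_a[i][0] for i, _ in landmarks],
                      [spots_b[j][0] for _, j in landmarks])
    ay, by = fit_axis([spots_a[i][1] for i, _ in landmarks],
                      [spots_b[j][1] for _, j in landmarks])
    matches = {}
    for id_a, (x, y) in spots_a.items():
        px, py = ax * x + bx, ay * y + by      # predicted position in gel B
        best_id, best_d2 = None, tolerance ** 2
        for id_b, (u, v) in spots_b.items():
            d2 = (px - u) ** 2 + (py - v) ** 2
            if d2 <= best_d2:
                best_id, best_d2 = id_b, d2
        if best_id is not None:
            matches[id_a] = best_id
    return matches


# Hypothetical coordinates: gel B is gel A stretched by 1.1 and shifted.
spots_a = {1: (10, 10), 2: (50, 20), 3: (30, 60),
           4: (70, 70), 5: (20, 40), 6: (90, 10)}
spots_b = {"b1": (16, 8), "b2": (60, 19), "b3": (38, 63),
           "b4": (82, 74), "b5": (27, 41), "b6": (200, 200)}
landmarks = [(1, "b1"), (2, "b2"), (3, "b3")]
matches = match_spots(spots_a, spots_b, landmarks)
```

Spot 6 of the first gel has no partner within the tolerance and is left
unmatched, which is the case an operator would then resolve by hand.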

Once a standard map of a given protein sample is made, 
one can enter qualitative annotations to make a reference 
database. Our master 2-dimensional gel database of trans- 
formed human amnion cell (AMA) proteins (20) lists 3430 
polypeptides of which 2592 correspond to cellular components,
having pI's ranging from 4 to 13 and molecular
weights between 8.5 and 230 kDa. The most abundant pro- 
teins in the database correspond to total actin (3.87% of total 
protein; about 90 million molecules per cell) while the 
least abundant of the recorded polypeptides are present in
the vicinity of 5000 molecules per cell. Some annotation 
categories we are using to establish the master AMA data- 
base include: 1) protein identification (comigration with 
purified proteins, 2-dimensional immunoblotting, microse- 
quencing); 2) amounts (total amounts and levels of synthe- 
sis); 3) subcellular localization (nuclear, cytoskeletal, mem- 
brane, membrane receptors, specific organelles, etc.); 4) 
antibodies; 5) posttranslational modifications (phosphoryla- 
tion, glycosylation, methylation etc.); 6) microsequencing; 7) 
cell cycle specificity (specific variations in levels of synthesis 
and amount); 8) regulatory behavior (effect of hormones, 
growth factors, heat shock, etc.); 9) rate of synthesis in normal
and transformed cells (proliferation sensitive proteins,
cell cycle specific proteins, oncogenes, components of the 
pathway (or pathways) that control cell proliferation); 10) 
function (mainly from comigration with proteins of known 
function); 11) sets of proteins that are coordinately regulated
(hierarchy of controls, differential gene expression in various 
cells, etc.); 12) cDNAs (cloned cDNAs); 13) proteins that are 
specific to a given disease (systematic comparison of protein 
patterns of fibroblast proteins from healthy and diseased in- 
dividuals); 14) expression and exploitation of transfected 
cDNAs; 15) pathways (metabolic, others); 16) gene localization
(genetic and physical); 17) effect of microinjected antibody 
on patterns of protein synthesis; and 18) secreted proteins. 

Information entered for any spot in a given annotation 
category can be easily retrieved by asking the computer to 
display the information on the color screen. For example 
Fig. 2E shows a synthetic image of a NEPHGE gel (master 
AMA database) displaying the information contained under 
the entry glycolytic pathway. Alternatively, one can use the 
function peruse annotations for spot to directly ask the computer
to list all the entries available for a particular protein.
By clicking the mouse in a given entry (in this case, presence 
in fetal human tissues) it is possible to take a quick look at 
the information in that particular entry (Fig. 2F). 
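
The two retrieval modes just described (showing every spot that carries
an entry in one category, and listing every category recorded for one
spot) can be pictured with a very small in-memory store. This is a
sketch only: the class and method names are hypothetical and say nothing
about how the actual database software is organized. The lipocortin V
entries are taken from Table 1; the glycolytic spot number is invented.

```python
from dataclasses import dataclass, field


@dataclass
class SpotRecord:
    """One polypeptide spot in a hypothetical annotation store."""
    ssp: str                                          # spot number
    annotations: dict = field(default_factory=dict)   # category -> entry


class GelDatabase:
    def __init__(self):
        self.spots = {}

    def annotate(self, ssp, category, entry):
        record = self.spots.setdefault(ssp, SpotRecord(ssp))
        record.annotations[category] = entry

    def spots_in_category(self, category):
        """All spots with an entry under one category (compare Fig. 2E)."""
        return sorted(s for s, r in self.spots.items()
                      if category in r.annotations)

    def peruse(self, ssp):
        """All categories recorded for one spot (compare Fig. 2F)."""
        return sorted(self.spots[ssp].annotations)


db = GelDatabase()
db.annotate("IEF 8216", "protein name", "lipocortin V")
db.annotate("IEF 8216", "cellular localization", "subcortical membrane")
db.annotate("IEF 0001", "pathways", "glycolytic pathway")  # invented spot
```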

A major obstacle encountered in building comprehensive 
2-dimensional gel protein databases is identifying the large 
number of proteins separated by this technology. In our 
databases (20, 21), known proteins are identified by one or 
a combination of the following procedures: 1) comigration
with known proteins, 2) 2-dimensional gel immunoblotting 
using specific antibodies, and 3) microsequencing of 
Coomassie Brilliant Blue stained human proteins recovered
from dried 2-dimensional gels (see next section). Protein 
identification by means of microsequencing may be difficult, 
as individual protein members of families with short peptide 
differences may escape detection. In the gene-protein database
of E. coli K-12 (14, 23), another major 2-dimensional gel
database available at present, proteins are being identified by 
a wider range of tests that include comigration with purified 
proteins; genetic criterion (deletion, insertion, frameshift,
nonsense, missense, regulatory), plasmid-bearing strains
and in vitro synthesis of protein; selective labeling (methylation,
phosphorylation); peptide map similarity; and physiological
criterion and selective derivatization.



So far we have received nearly 550 antibodies from labora- 
tories all over the world and these are being systematically 
tested by 2-dimensional gel immunoblotting for antigen de- 
termination. Similarly, purified proteins and organelles 
provided by several laboratories have greatly aided identification
of unknown proteins (20, 21). We routinely request antibodies
and protein samples and promise the donors to make
available all the information we may have accumulated on that
particular protein. For example, Table 1 lists entries available
for Lipocortin V (IEF SSP 8216), also known as annexin V,
VAC-α, endonexin II, renoconin, chromobindin-5, anticoagulant
protein, PAP-I, 35-γ-calcimedin, IBC, calphobindin,
and anchorin CII.

As mentioned previously, one distinct advantage of 
2-dimensional gel electrophoresis is the possibility of studying
quantitative variations in cellular protein patterns that
may lead to identification of groups of proteins that are ex- 
pressed coordinately during a given biological process. 
Quantitation, however, is not an easy task as reflected by the 
lack of published data on global cellular protein patterns. We 
believe this is partly due to difficulties in obtaining sets of 
gels that are suitable for computer analysis (streaking,
material remaining at the origin, etc.) as well as to limita- 
tions (laborious editing time, need of calibration strips to 
merge images, limited dynamic range, etc.) in the computer 
analysis systems available at the moment. Perhaps the most 
advanced quantitative studies published so far using com- 
puter analysis have been carried out by Garrels and co- 
workers (18, 22). In particular, these investigators have estab- 
lished a quantitative rat protein database (18, 22) designed 
to study growth control (proliferation, growth inhibitors, and 
stimulation) and transformation in well-defined groups of 
cell lines obtained by transformation of rat REF52 cells with 
SV40, adenovirus, and the Kirsten murine sarcoma virus. 
These studies have revealed clusters of proteins induced or 
repressed during growth to confluence as well as groups of 
transformation-sensitive proteins that respond in a differen- 
tial fashion to transformation by DNA and RNA viruses. A 
most interesting feature of this quantitative database is the 
discovery of a group of coregulated proteins that show similar
expression patterns as the cell cycle-regulated DNA replication
protein known as proliferating cell nuclear antigen
(PCNA)/cyclin (45).

In our human databases, most quantitations have been 
carried out by estimating the radioactivity contained in the
polypeptides by direct counting of the gel pieces in a scintil- 
lation counter (20, 21). Up to 700 proteins can be cut out 
through appropriate exposed films in a period of time com- 
parable to that required for editing a synthetic image. 
Manual quantitation of this large number of spots is difficult 
without the assistance of a master reference image and a 
numbering system that can be used to identify the spots. Us- 
ing this approach, we have recorded quantitative changes in 
the relative abundance of 592 [35S]methionine-labeled proteins
synthesized by quiescent, proliferating, and SV40
transformed human embryonic lung MRC-5 fibroblasts (21). 
Some data concerning cytoskeletal and cytoskeletal-related 
proteins are presented in Fig. 2G. Our studies as well as 
those of Garrels and co-workers (18, 22) may in the long run 
help define patterns of gene expression that are characteristic 
of the transformed state. 

OTHER 2-DIMENSIONAL GEL PROTEIN 
DATABASES 



As mentioned previously there are other 2-dimensional gel 
databases available in computer form that have been pub- 
phocytes, leukocytes, leukemic cells), mouse (NIH/3T3 cells,
T lymphocytes), Aplysia, yeast (Saccharomyces cerevisiae), plants
(wheat, barley, sorghum), and Euglena. Databases of tissue
proteins (brain, whole mouse, liver) and body fluid proteins
(plasma proteins, cerebrospinal fluid, urine, and milk) are 
being established in several laboratories. The reader is 
directed to the review by Celis et al. (4) for details and refer- 
ences concerning these databases. 



MICROSEQUENCING HAS ADDED A NEW 
DIMENSION TO COMPREHENSIVE 
2-DIMENSIONAL GEL DATABASES: A DIRECT 
LINK BETWEEN PROTEINS AND GENES 

The development of highly sensitive amino acid gas-phase or 
liquid-phase sequenators (24), together with the establish- 
ment of efficient protein and peptide sample preparation 
methods, has opened the possibility to perform a systematic 
sequence analysis of proteins resolved by 2-dimensional gel 
electrophoresis. Indeed, generated pieces of protein se- 
quences can be used to search for protein identity (compari- 
son with available sequences stored in databanks) as well as 
for preparing specific DNA probes for cloning of as yet un- 
characterized proteins (Fig. 1). In addition, partial protein 
sequences can be stored in 2-dimensional gel databases (for
example, see Fig. 2H) and offer a unique link between pro- 
teins and genes (Fig. 1). 
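
Because most amino acids are encoded by more than one codon, a partial
protein sequence corresponds to a whole family of possible DNA
sequences, which is why probe design usually targets the least
degenerate stretch of a peptide. The Python sketch below counts that
degeneracy using the codon multiplicities of the standard genetic code.
The function names and the six-residue window are illustrative
assumptions; the peptide is one of the lipocortin V fragments listed in
Table 1.

```python
# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def probe_degeneracy(peptide):
    """Number of distinct DNA sequences that could encode the peptide."""
    count = 1
    for residue in peptide:
        count *= CODONS_PER_RESIDUE[residue]
    return count


def least_degenerate_window(peptide, length):
    """Return (start index, sub-peptide, degeneracy) of the best window."""
    best = None
    for start in range(len(peptide) - length + 1):
        window = peptide[start:start + length]
        degeneracy = probe_degeneracy(window)
        if best is None or degeneracy < best[2]:
            best = (start, window, degeneracy)
    return best


full = probe_degeneracy("GTVTDFPGFDER")
best = least_degenerate_window("GTVTDFPGFDER", 6)
```

For this peptide the full 12-residue stretch admits 786432 coding
sequences, while the window DFPGFD admits only 256, so a probe pool
aimed at that window would be far smaller.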

In the early 1970s gel electrophoresis was used to purify 
proteins for sequencing purposes (reviewed by Weber and 
Osborn in ref 25). Proteins were recovered by diffusion and 
sequenced by the manual dansyl-Edman degradation at the 
nanomole level. This technique was further refined by using 
electro-elution to recover proteins and by miniaturizing the 
system (26). This method has been used extensively, but 
showed increasing drawbacks (low yields, protein samples 
contaminated by free amino acids, and NH2-terminal blocking)
as the amounts of handled protein gradually became
smaller (e.g., at the 10 picomol level). 

Most of the problems referred to above have been 
minimized with the introduction of protein-electroblotting
procedures (27-32). When proteins are blotted on chemi- 
cally inert membranes, it is possible to sequence the immobi- 
lized proteins directly without additional manipulations. 
Thus, depending on the amount of bound protein and its nature,
this direct sequencing procedure generally yields NH2-terminal
sequences containing 10-40 residues. As such, this technique was
used to identify, by their NH2-terminal sequences, differentially
expressed major proteins from total
cellular extracts, separated on 2-dimensional gels. A major 
difficulty encountered in this procedure is the occurrence of 
frequent artefactual blockage of the proteins. Several studies 
suggest that this phenomenon is mainly due to reaction with 
contaminants (particularly unpolymerized acrylamide 
present in the gel) and to a high dilution of the protein (low 
concentration of the protein per unit membrane surface). In 
addition to this primarily technical problem, many proteins 
are blocked in vivo by acylation or by a pyrrolidone carboxylic
acid cap. 

The problem of partial or complete NH2-terminal blockage
can be circumvented by generating internal amino acid
sequences. This is achieved by fragmenting the protein 
present in the gel (gel in situ cleavage) or by cleaving it while 
bound to the membrane (membrane in situ cleavage) 
(33-35). In both cases, proteins are either cleaved in a res- 
tricted way (e.g., by limited enzymatic digestion or by using 
restriction chemical cleavage conditions) or fragmented into 
smaller peptides. 
Of the different combinations examined, we had good
results by using exhaustive proteolytic digestion on
membrane-immobilized proteins. This method has been
described for Ponceau red-stained proteins on nitrocellulose
blots (34), for Amido black-stained Immobilon-bound proteins,
and for fluorescamine-detected proteins on glass fiber
membranes (35). The proteases used (trypsin, chymotrypsin,
or pepsin) cleave at multiple sites, generating small peptides
that elute from the blot into the digestion buffer from which
they are purified by reversed-phase high performance liquid
chromatography (HPLC) before being sequenced individually.
Although each of these manipulations could be expected
to result in a reduced yield of final sequence information, we
were surprised that the peptides could be sequenced with
high efficiency. In our hands, this approach could be routinely
applied to gel-purified proteins available in amounts
ranging from 5 to 10 µg, and often yielded sequence information
covering more than 30% of the total protein. As
membrane-immobilized proteins are not homogeneously
digested, but rather show protease sensitivity next to resistant
regions, the number of peptides generated is much lower
than expected from the number of potential cleavage sites.
Consequently, HPLC peptide chromatograms are less complex
and most peptides can be recovered in pure form.
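
The "number of potential cleavage sites" can be made concrete with a
theoretical complete digest. The Python sketch below applies the
commonly stated trypsin rule (cleavage on the carboxyl side of lysine
or arginine, except when the next residue is proline). The rule as
coded, the function name, and the toy sequence are assumptions for
illustration; the toy sequence simply strings together two Table 1
peptides with invented flanking residues and is not the lipocortin V
sequence. As noted above, membrane-bound proteins yield fewer peptides
than such a complete digest predicts.

```python
def tryptic_peptides(sequence):
    """Theoretical complete tryptic digest of a one-letter sequence."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        # Cleave after K or R unless the following residue is proline.
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])   # C-terminal remainder
    return peptides


# Toy sequence (hypothetical), not a real protein.
toy = "AAKGTVTDFPGFDERVLTEIIASRPQK"
fragments = tryptic_peptides(toy)
```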

As only limited amounts of a protein mixture can be 
loaded on a 2-dimensional gel, proteins of interest are often 
obtained in yields insufficient for the currently available se- 
quencing technology. More material can be obtained by en- 
riching for a certain subcellular fraction (purified cell or- 
ganelles) or by exploiting affinity (dyes, metals, drugs, etc) or 
hydrophobic properties of proteins before gel analysis. All of 
the sequencing results accumulated so far in the human pro- 
tein database (20) (a few are shown in Fig. 2H) have been 
obtained from analysis of protein spots collected from 
2-dimensional gels that had been stained with Coomassie 
blue according to standard procedures and dried for storage. 
Proteins are recovered from the collected gel pieces by a 
protein-elution-concentration device, combined with gel 
electrophoresis and electroblotting. Details of this technique 
have been reported in a previous communication (42) and a 
brief outline is given below. 

Combined gel pieces are allowed to swell in gel sample 
buffer (a total volume of 1.5 ml). The gel pieces combined
with the supernatant are then collected into a large slot made 
in a new gel. The slot is further filled with Sephadex G-10 
equilibrated in gel sample buffer. During consecutive gel 
electrophoresis, most of the electrical current passes on the 
side of the slot instead of passing through the slot. This 
results in both a vertical stacking and horizontal contraction 
of the protein band. With this device the protein is efficiently 
eluted from the gel pieces and concentrated from a large 
volume into a narrow spot. The highly concentrated (about 
5 mm²) protein spot is then electroblotted on PVDF
membranes, stained with Amido black, and in situ digested
with trypsin. The peptides generated during digestion elute 
from the membrane into the supernatant, and can be separated
by narrow bore reversed-phase HPLC and collected individually
for sequence analysis.

Using this and previous procedures (37, 39, 42), we have 
so far analyzed 70 protein spots collected from 
2-dimensional gels (20, and unpublished observations) (see 
for example Fig. 2H). The sequence information amounts to 
2100 allocated residues corresponding to an average of 30 
residues per protein spot. So far we have made cDNAs of 
many of the unknown proteins that have been microse- 
quenced, and a substantial number has been cloned and se- 
quenced. All available information indicates that it may be 
possible to obtain partial sequence information from most of 
the proteins that can be visualized by Coomassie Brilliant
Blue staining.

Partial protein sequences are stored in the database as displayed
in Fig. 2H, and it should be possible in the near future
to interface this information with forthcoming DNA sequence
data from the human genome project. In the long
run, as the human genome sequences become available, it
will be possible to assign partial protein sequences to genes
for which the full DNA sequence and chromosomal location
are known (Fig. 1).



SUMMARY 

The studies presented in this brief review are intended to 
demonstrate the usefulness of computer-aided 2-dimensional 
gel electrophoresis and microsequencing to analyze cellular 
protein patterns, and to link protein and DNA information. 
As more information is gathered worldwide, comprehensive 
databases will depict an integrated picture of the expression 
levels and properties of the thousands of proteins that orchestrate
most cellular functions.

Clearly, databases allow easy access to a large body of data 
and provide an efficient medium to communicate standardized
protein information. In the future, databases will
foster a wide variety of biological information that can be
used to support collaborative research projects in basic and
applied biology as well as in clinical research (2, 5, 46). Once
a protein is identified in a particular database all the information
gathered on it can be made available to the scientist.
However, many problems must be solved before protein 
databases become of general use to the scientific community. 
A most urgent one is to promote standardization of the gel 
running conditions so that data produced in a given labora- 
tory may be used worldwide. Surprisingly, the gel running 
technology as it stands today is still a craftsmanship art.

Finally, comprehensive, computerized databases of pro- 
teins, together with recently developed techniques to 
microsequence proteins, offer a new dimension to the study 
of genome organization and function (Fig. 1). In particular, 
human protein databases may become increasingly impor- 
tant in view of the concerted effort to map and sequence the 
entire human genome. This formidable task is expected to 
dominate biological research in the next decades.
TABLE 1. Some entries for lipocortin V in the human AMA 2-dimensional gel protein database

Entries for lipocortin V (IEF SSP 8216), each followed by the information entered

 1. Protein name
    Lipocortin V, renoconin, chromobindin-5, endonexin II, anticoagulant protein
    PAP-I, VAC-α, 35-γ-calcimedin, IBC, calphobindin I, anchorin CII, annexin V

 2. Percentage of total protein
    0.110% (about 2,800,000 molecules per cell)

 3. Apparent molecular weight (mr)
    33.3 kDa

 4. Isoelectric point (pI)
    4.76

 5. Method (or methods) of identification
    Microsequencing, 2-dimensional immunoblotting, comigration

 6. Credit to investigators that aided in identification
    G. Bauw, J. Vandekerckhove, and colleagues, Rijksuniversiteit Gent; B. Pepinsky,
    BIOGEN, Cambridge; N.G. Ahn, University of Washington

 7. Antibody against protein
    Polyclonal (rabbit, antibody no. 20), B. Pepinsky, BIOGEN, Cambridge

 8. Comigration with human proteins
    Lipocortin V, N.G. Ahn, Howard Hughes Medical Institute, Washington University

 9. Cellular localization
    Subcortical membrane

10. Calcium/phospholipid-dependent membrane proteins
    Lipocortin V

11. Function
    Regulation of various aspects of inflammation, immune response, blood coagulation
    and differentiation

12. Partial amino acid sequence
    GTVTDFPGFDER (7-18), VLTEIIASR (109-117), QVYEEEVGSSLEDDVYC
    (127-143), ?GTDEEKFITIFGT(R) (187-201)

13. cDNA sequence
    Known. R. Blake et al., J. Biol. Chem. 263, 10799-10811; 1988
    (pI = 4.76 from translated sequence)

14. Levels in fetal human tissues
    Adrenal glands = +++; brain = +++; cerebellum = +++; ear = +++; eye = +++;
    heart = +++; hypophysis = +++; liver = +++; lung = +++; meninges = +++;
    mesonephric tissue = +++; striated muscle = +++; pancreas = +++;
    skin = +++; spleen = +++; stomach = +++; submandibular gland = +++;
    small intestine = +++; thymus = +++; thyroid gland = +++; tongue = +++;
    ureter = +++

15. Levels in quiescent, proliferating, and transformed MRC-5 fibroblasts
    Q (quiescent) = 1.1; P (proliferating) = 1.0; T (SV40 transformed) = 0.3

16. Distribution in Triton supernatant and cytoskeletons
    Mainly supernatant



lished in extenso: these correspond to the E. coli K-12 
protein-gene database (14, 23) and to the rat REF52 database
(18, 22).

The E. coli K-12 cellular protein-gene database is perhaps 
the most complete of all databases reported so far and even- 
tually it should trace each protein back to its structural gene. 
Information contained in this database includes: gene/pro- 
tein name (protein name, EC number, gene name); 
2-dimensional gel spot designations (x-y coordinates from
reference gels, alphanumeric designation); genetic information
(linkage map location, physical map location, GenBank code,
sequence reference, location on Kohara clones); biochemical
information (molecular weight, pI, number of
residues of each amino acid, mole percent of each amino 
acid, total number of amino acids in a polypeptide), and 
regulatory information (cellular level of protein in different 
media and different temperature, member of regulon, member
of stimulon). Major advances of this database are envisaged
in the future in view of the imminent sequencing of



the whole E. coli genome as well as the development of im- 
proved methods to express cloned genes. 
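
Several of the biochemical fields listed for this database (number of
residues of each amino acid, mole percent of each amino acid, total
number of amino acids) follow directly from a sequence once it is
known. A minimal Python sketch, with a hypothetical function name and
using a short lipocortin V peptide from Table 1 purely as sample input:

```python
from collections import Counter


def composition(sequence):
    """Map each residue to (count, mole percent) for a one-letter sequence."""
    counts = Counter(sequence)
    total = len(sequence)
    return {residue: (n, 100.0 * n / total)
            for residue, n in sorted(counts.items())}


table = composition("VLTEIIASR")
total_residues = sum(count for count, _ in table.values())
```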

The rat REF52 2-dimensional gel protein database lists 
about 1600 proteins that have been recorded using the 
QUEST analysis system (18, 22). Included in this quantitative
database are 1) protein names (cytoskeletal and heat
shock proteins as well as various nuclear, mitochondrial, and 
cytoplasmic proteins), 2) annotations (subcellular localiza- 
tion, modification, recognition by specific antibodies, 
coprecipitation, NH2-terminal sequence, cross-reference to
protein sequence information and references to the litera- 
ture), 3) protein sets (cytoskeletal proteins, phosphoproteins, 
sets of proteins with PCNA/cyclin-like properties, etc.) and 
4) general quantitative data (protein synthesis during growth 
of normal REF52 cells to confluence and quiescence, and af- 
ter restimulation of growth-inhibited cells). 

In addition to the 2-dimensional gel databases mentioned 
so far there are several smaller cellular databases being established
in human (normal human diploid fibroblasts, lym-
The automatic matching process that has been described
in detail by Garrels et al. (12) takes about 5 min. Matched
proteins are indicated with the same letters in both gels (Fig.
2C). The usefulness of this function is emphasized by the fact
that data accumulated on common household proteins can
be easily transferred to any other human cellular cell type 
whose 2-dimensional gel cellular protein pattern is matched 



to our standard AMA 2-dimensional gel protein image. Al- 
ternatively, if the standard gel is part of a matchset (set of 
gels in a given experiment) it can be used as a linker gel to 
compare, for example, the quantitative values of a given pro- 
tein throughout the experiment (see Fig. 2D; levels of some 
proteins in normal and SV40 transformed human MRC-5 
fibroblasts) or with other standard images in different sets of 
Figure 1. Interface between partial protein sequence databases,
comprehensive 2-dimensional gel databases, and the human genome
sequencing project. Appropriate software is required to compare
protein and DNA sequences. In general, although the inference
of a protein's sequence from the DNA sequence (thick arrow)
is direct and unambiguous, the DNA sequence can only be inferred
approximately from the protein sequence (thin arrow) and cloning
of the gene requires either a cDNA or the requisite group of
oligonucleotide probes deduced from the partial amino acid
sequence. Modified from ref 6.



in the databases (refs 24-42 and references therein). Partial
protein sequences can be used to search for protein identity
as well as to prepare specific DNA probes for cloning
as-yet-uncharacterized proteins (Fig. 1). As these sequences can be
stored in the database (see for example Fig. 2H), they offer
a unique opportunity to link information on proteins with
the existing or forthcoming DNA sequence data on the human
genome (Fig. 1) (20, 36, 39).

Using the integrated approach offered by comprehensive 
2-dimensional gel databases (Fig. 1), it will be possible to 
identify phenotype-specific proteins; microsequence them 
and store the information in the database; search for homology
with previously characterized proteins; clone the
cDNAs; assign partial protein sequences to genes for which
the full DNA sequence and the chromosome location are
known; and study the regulatory properties and function of
groups of proteins (pathways, organelles, etc.) that are
coordinately expressed in a given biological process. Comprehensive
2-dimensional gel protein databases will depict an
integrated picture of the expression levels and properties of the
thousands of protein components of organelles, pathways, 
and cytoskeletal systems in both physiological and abnormal 
conditions and are expected to lead to identification of new 
regulatory networks in different cell types and organisms. In 
the future, 2-dimensional gel protein databases may be 
linked to each other as well as to national and international 
specialized databanks on nucleic acid and protein sequences, 
protein structures, NMR experimental data, complex carbo- 
hydrates, etc. 

A few 2-dimensional gel protein databases that are accessible 
in a computer form have been published in extenso: these 
correspond to the protein-gene database of Escherichia coli
K-12 developed by Neidhardt and colleagues (14, 23), the rat
REF52 database established by Garrels and co-workers at
Cold Spring Harbor (18, 22), and a few human databases
(transformed amnion cells [15, 20], normal embryonal lung
MRC-5 fibroblasts [17, 21], keratinocytes [19], and peripheral
blood mononuclear cells [15]) developed in Aarhus. Given
space limitations and to keep this review in focus, we will 
concentrate on the computerized analysis of human cellular 
2-dimensional gel patterns, and in particular on the steps in- 
volved in establishing comprehensive 2-dimensional gel 
databases that can link protein and DNA information. 



MAKING AND MANAGING A COMPREHENSIVE
2-DIMENSIONAL GEL DATABASE OF HUMAN
CELLULAR PROTEINS 

The first step in making a comprehensive 2-dimensional gel
protein database is to prepare a synthetic image (digital form
of the gel image) of the gel (fluorogram, Coomassie blue or
silver stained gel) to be used as a standard or master reference.
This can be done with laser scanners, charge couple device
(CCD)² array scanners, television cameras, rotating drum
scanners, and multiwire chambers (13). Computerized analysis
systems for spot detection, quantitation, pattern matching,
and data handling (access and retrieval of information,
database making) have been described in the literature
(ELSIE [43], GELLAB [11], HERMeS [44], MELANIE
[10], QUEST [9], and TYCHO [8]) and some are available
commercially (PDQUEST, Protein Database Inc., Huntington,
N.Y.; KEPLER, Large Scale Biology, Rockville, Md.;
Visage, BioImage Corporation, Ann Arbor, Mich.; Gemini,
Joyce Loebl, Gateshead; Microscan 1000, Technology
Resources Inc., Nashville, Tenn.; and MasterScan, Billerica,
Mass.). Unfortunately, most of these systems are incompatible
with one another and their advantages and disadvantages
have been discussed by Miller (13).

In our work station in Aarhus, fluorograms are scanned
with a Molecular Dynamics laser scanner and the data are
analyzed using the PDQUEST II software (Protein Databases
Inc.) (12) running on a SPARCstation computer 4100
FC-8-P3 from SUN Microsystems, Inc. The scanner measures
intensity in the range of 0-2.0 absorbance. A typical
scan of a 17 x 17 cm fluorogram takes about 2 min. Steps
in image analysis include: initial smoothing, background
subtraction, final smoothing, spot detection, and fitting of
ideal Gaussian distribution to spot centers. Spot intensity is
calculated as the integration of a fitted Gaussian. If calibration
strips containing individual segments of a known
amount of radioactivity are used, it is possible to merge multiple
exposures of the sample image into a single data image
of greater dynamic range. Once the synthetic image is
created it can be stored on disk and displayed directly on the
monitor. Functions that can be used to edit the images include:
cancel (for example, to erase scratches that may have
been interpreted as spots by the computer; cancel streaks or
low dpm spots), combine (sometimes a spot may be resolved
into several closely packed spots), restore, uncombine, and
add spot to the gel. The process is time consuming, about
1-1/2 days per image. Edited standard images can be matched
to other synthetic images. Figure 2A shows a portion of a
standard synthetic image (IEF) of a fluorogram of
[35S]methionine labeled cellular proteins from human AMA
cells (master database) (20). Images can be displayed either
in black and white (resembling the original fluorograms) or
in color (other images in Fig. 2), depending on the need. As
shown in Fig. 2B, each polypeptide is assigned a number by
the computer, which facilitates the entry and retrieval of
qualitative and quantitative information for any given spot
in the gel (20). The standard image can be matched automatically
by the computer to other standard or reference gels
(Fig. 2C; matching of AMA cellular proteins [left] to MRC-5
proteins [right]) provided a few landmark spots are given
manually as reference (indicated with a + in Fig. 2C) to
initiate the process.
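
The statement that spot intensity is the integration of a fitted
Gaussian can be illustrated numerically. For a two-dimensional
Gaussian with peak height A and widths sigma_x and sigma_y, the
integrated volume is 2 * pi * A * sigma_x * sigma_y. The Python sketch
below computes that closed form and checks it against a brute-force sum
over a pixel grid. The axis-aligned Gaussian model, the function names,
and the parameter values are assumptions for illustration and are not
taken from the PDQUEST software.

```python
import math


def gaussian_spot_volume(amplitude, sigma_x, sigma_y):
    """Closed-form integral of A*exp(-x^2/(2 sx^2) - y^2/(2 sy^2))."""
    return 2.0 * math.pi * amplitude * sigma_x * sigma_y


def numeric_spot_volume(amplitude, sigma_x, sigma_y, step=0.05, extent=6.0):
    """Riemann-sum check over roughly +/- extent sigmas on each axis."""
    nx = int(extent * sigma_x / step)
    ny = int(extent * sigma_y / step)
    total = 0.0
    for i in range(-nx, nx + 1):
        x = i * step
        gx = math.exp(-x * x / (2.0 * sigma_x ** 2))
        for j in range(-ny, ny + 1):
            y = j * step
            total += gx * math.exp(-y * y / (2.0 * sigma_y ** 2))
    return amplitude * total * step * step


# Hypothetical fitted parameters for one spot.
analytic = gaussian_spot_volume(2.0, 1.0, 1.5)
numeric = numeric_spot_volume(2.0, 1.0, 1.5)
```

Using the fitted volume rather than the raw peak height means that a
broad, low spot and a sharp, tall spot containing the same amount of
label receive the same intensity.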



²Abbreviations: CCD, charge couple device; PCNA, proliferating
cell nuclear antigen; HPLC, high performance liquid
chromatography.
Nonenzymatic extraction of cells from clinical tumor
material for analysis of gene expression by two-dimensional
polyacrylamide gel electrophoresis

We have compared different methods of preparation of malignant cells for
two-dimensional electrophoresis (2-DE). We found all methods using fresh
tissue to be superior compared to methods using frozen tissue. Our results
indicate that nonenzymatic methods of preparation of tumor cells, including
fine needle aspiration, scraping and squeezing, have advantages over methods
using enzymatic extraction of cells. Nonenzymatic methods are rapid, appear
to reduce loss of high molecular protein species, and alleviate the necessity of
separating viable and nonviable cells by Percoll gradient centrifugation. Using
these techniques, high-quality 2-DE maps were derived from tumors of the
lung and breast. In the resulting polypeptide patterns, heat shock proteins,
non-muscle tropomyosins and intermediate filaments were identified. We conclude
that nonenzymatic extraction of malignant cells from fresh tumor tissue
improves the possibilities that these techniques may be useful in clinical
diagnosis.



1 Introduction 

Tumors may develop by a number of different mechan- 
isms in any given cell type. At the time of diagnosis, 
tumors will have progressed along different pathways to 
various stages of malignancy. To provide a basis for indi- 
vidual therapy it is of importance to examine specific 
properties of the tumor cell population in each patient. 
A large number of different markers have been de- 
scribed in order to increase the diagnostic accuracy. It is 
likely that a combination of several markers is needed
in the future in order to reflect different properties of 
the tumor. One important method for the resolution of a 
large number of potential markers is two-dimensional 
electrophoresis (2-DE). Extensive efforts are being made
to identify various polypeptides separated by 2-DE
and to characterize how the expression of these polypeptides
is affected by the response to cellular transformation
and various culture conditions [1, 2]. It would be of
value to transfer this information to 2-DE separations of 
polypeptides from tumor tissue samples. However, one 
prerequisite is that the quality of the 2-DE gels from 
tumor samples is comparable in quality with 2-DE gels 
from samples of cultured cells. 

Frozen tumor tissues are commonly used for various bio- 
chemical assessments. However, if such samples are ana- 
lyzed by 2-D polyacrylamide gel electrophoresis (PAGE), 
the polypeptide patterns are obscured by contamination 
of serum- and connective tissue proteins. Such nontu- 
mor-cell-related variations represent serious problems in 
the interpretation and inter-patient comparison of 2-DE 
patterns [3]. 2-DE patterns of cells prepared from fresh
tumor material were analyzed after enzymatic extraction
of tumor cells [4, 5] or after culturing tumor fragments in
medium containing radioactive amino acids [6]. These
procedures may, however, lead to alterations in the gene 
expression/polypeptide patterns. We are only aware of 
one study where nonenzymatic extraction of cells from 
fresh tumor tissue (prostate cancer) was used to prepare 
samples for 2-D PAGE [4]. We have examined enzymatic 
extraction and various nonenzymatic preparation tech- 
niques, including fine needle aspiration, for the prepara- 
tion of cells from fresh tumor tissues. We describe 
nonenzymatic extraction procedures that are rapid, lead 
to high-quality 2-DE patterns, and that alleviate the 
necessity to purify tumor cell populations from dead 
cells. 

2 Materials and methods 

2.1 Cell cultures and samples used for spot 
identification 

A rat embryonal fibroblast cell line, WT2 (a kind gift 
from Dr. J. I. Garrels and Dr. S. Pattersson) was used for
the identification of a number of heat shock and structural
proteins. Human normal diploid lung fibroblasts,
WI38, and human epithelial breast carcinoma cells, MDA-231
and MCF-7, were purchased from ATCC and grown
as recommended. Polypeptides prepared from a leu- 
kemia type pre-B-ALL were separated by 2-DE. The 
2-DE map was then analyzed by Dr. S. M. Hanash (Uni- 
versity of Michigan, Ann Arbor, USA). 
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2.2 Tumor tissues samples 

In this study, 2-DE maps from seven tumors were used
as representative illustrations: two adenocarcinomas of
the lung (LA, and LB, mucinous, both cases intermediate
grade of differentiation), one squamous carcinoma
of the lung (LS), one carcinoid-like breast cancer (BC),
one microfollicular adenoma (highly differentiated) of
the thyroid (TA), one highly differentiated hyperneph- 
PMSF (0.2 mM), EDTA (1.0 mM), 0.5% Nonidet P-40
(NP-40), and 3-[(3-cholamidopropyl)dimethylammonio]-
1-propanesulfonate (CHAPS; 25 mM) was added carefully,
mixed for 2.5 h and centrifuged for 15 min at
10000 rpm to remove any insoluble material. Duplicate
or triplicate samples were taken for protein determination [11].
Samples were stored at -80 °C prior to isoelectric
focusing (IEF).
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cells (Fig. 2d). Polypeptides were identified through a 
laboratory exchange of cell samples/2-DE maps and
through 2-DE analysis of purified proteins (Table 1). 

3.2 Preparation of samples from solid tumors 
3.2.1 Fresh versus frozen tissue 

An adenocarcinoma of the lung (LA) was prepared for 
2-DE by conventional methods using frozen material 
(Fig. 3a). There are several possible explanations for the poor
resolution using frozen tissue, including the presence of high
molecular weight protein aggregates. Filtering extracts
through 0.1 µm filters (Durapore, Millipore) resulted in
a slightly improved resolution (not shown). When fresh
tumor tissue from tumor LA was used for sample prepa- 
ration, using fine needle aspiration to collect the cells, 
the resolution was considerably improved (Fig. 3b). The 
use of fresh tissue resulted in a general increase in reso- 
lution, which was most pronounced in the 50—100 kDa 
molecular mass range. A number of differences in the 
protein profiles of the gels in Figs. 3a and 3b can be ob- 
served, some of which are indicated in the figures. The 
decrease in serum albumin in Fig. 3b is likely to result 
from loss of serum proteins occurring when cells were 
pelleted after aspiration. Other differences, such as the
decreased level of transformation-sensitive tropomyosins
(TM1-TM3), may result from enrichment of tumor cells
in the sample of Fig. 3b. Fine needle aspiration, a well-established
technique in cytology, extracts mainly tumor
cells because of decreased intercellular adhesiveness of
neoplastic cells as compared to normal tissue. Microscopic
examination of Diff-Quik-stained extracted cells
from case LA revealed almost 100% tumor cells,
whereas the whole tissue extract contained approximately
60% tumor cells.



Table 1. Names and abbreviations for identified spots

Spot     Name                                      Basis for identification

A        Actins                                    a
aA       alpha-Actinin                             a
B23      Protein B23/Numatrin                      a
EF2      Elongation factor 2                       a
EF1      Elongation factor 1 beta                  a
GT       Glutathione S-transferase (pi)            a
hsp60    Heat shock protein 60                     a
hsp73    Heat shock protein 73                     a
hsp80    Heat shock protein 80, GRP78, BiP         a
hsp90    Heat shock protein 90                     a
hsp100   Heat shock protein 100, Endoplasmin       a
IFa      Intermediary filament associated          a
k8       Cytokeratin 8                             b and a
LamB     Lamin B                                   a
Lip1     Lipocortin I                              a
Lip2     Lipocortin II                             a
Lip5     Lipocortin V                              a
Mit1     Mitcon 1/beta-F1 ATPase                   a
Mit2     Mitcon 2                                  a
Mit3     Mitcon 3                                  a
MRP      Mucin-related polypeptides                -
pcna     Proliferating cell nuclear antigen        c and a
PLC      Phospholipase C (1)                       a
RO       RO/SS-A antigen                           a
SA       Serum albumin                             b and a
aT       alpha-Tubulin                             a
bT       beta-Tubulin                              a
tm1      Non-muscle tropomyosin isoform 1          b and a
tm2      Non-muscle tropomyosin isoform 2          b and a
tm3      Non-muscle tropomyosin isoform 3          b and a
tm4      Non-muscle tropomyosin isoform 4          b and a
tm5      Non-muscle tropomyosin isoform 5          b and a
TPI      Triose phosphate isomerase                a
V        Vimentin                                  b and a
Vid1     Vimentin derived protein                  b and a
Vid2     Vimentin derived protein                  b and a
Vid3     Vimentin derived protein                  b and a
Vid4     Vimentin derived protein                  b and a
Vin      Vinculin                                  a

a. homologous position with respect to other mammalian systems
b. purified protein(s)
c. immunoblotting
Figure 4. 2-DE analysis of a case of breast carcinoma (BC). Comparison of 2-DE quality and some differences in detected spots (arrowheads
indicate increased intensity and circles or brackets indicate decreased intensity of the same spots) between (A) enzymatic and (B)
nonenzymatic (scraped) tissue preparation.
difference in intensity were lower than when a nonenzymatic
preparation was compared with an enzymatic preparation.

2-DE maps of satisfactory quality were prepared by a 
third procedure. Cells were released from small pieces of 
tumor by squeezing (see Section 2). Some examples of 
this are shown in Fig. 6 where 2-DE maps derived from 
a case of hypernephroma, KH (Fig. 6a), a case of thyroid
tumor, TA (Fig. 6b) and a case of corpus cancer, CP (Fig.
6c) can be seen. We conclude that nonenzymatic techniques
are useful for 2-DE analysis of a number of different
tumors. The quality of the resulting gels is comparable
to that obtained using cultured cells (compare
the gels in Fig. 2 with those in Fig. 4, 6 and ?). Which of
these methods will be optimal will, in our experience,
depend on the tumor material. For example, very small
tumors are preferably extracted by squeezing; on the
other hand, breast cancers (which are often fibrous)
yield satisfactory samples using scraping.

3.2.3 Purification of cells on Percoll gradients

We considered the possible advantage of separating
viable cells from dead cells, erythrocytes, and debris
using discontinuous Percoll gradients. Cells collected
with these observations (Fig. 8). A number of potential
and interesting markers, like tropomyosin isoforms, cytokeratins
and heat shock proteins, appear to be insensitive
to loss of viability during the preparation procedure.
We have to date made numerous observations of altera- 
tions in the expression of these polypeptides in breast 
cancers and lung cancers. 

Another problem that may occur, irrespective of sample 
preparation techniques used, is admixture of lympho- 
cytes. These cases are easily detectable in smears and it 
may therefore be possible to select lymphocyte specific 
spots as "internal markers" for the 2-D PAGE analysis. 
Studies using this approach are in progress. Many of the 
polypeptides identified are structural (Table 1). Since the
expression of many of these polypeptides is known to
vary between normal and malignant cells, the possibility
of determining their expression simultaneously is
appealing. In the specific case of breast cancer, alterations
in the expression of intermediate filament proteins
(cytokeratins) are known to occur during tumor progression [23].
Other proteins known to be differentially
expressed between normal cells and transformed cells
are tropomyosins, numatrin/B23, heat shock proteins
and PCNA. To this end, we have observed alterations in
the expression of cytokeratin 8, hsp 90, and non-muscle
tropomyosin isoform 2 during malignant progression
(Okuzawa et al., in preparation and Franzén et al., in preparation).

The method of choice for sample preparation from 
tumor tissues will depend on the properties of the tumor 
material studied. It may be important to use only one 
method when comparing cases within one group, as differences
were observed between methods. The advantages
of the nonenzymatic techniques are (i) that they minimize
contamination with connective tissue, (ii) that
problems with contamination of serum proteins are
avoided, and (iii) that separation of viable and dead cells
is not necessary. The resolving power of 2-D
PAGE is thereby maximized for the analysis of human tumors
and studies on inter-tumor variations in gene expression
are facilitated. In addition, the polypeptide patterns obtained
may be more representative of the in vivo tumor
cell since the use of enzymes and incubations has been
minimized.

We would like to thank Dr. J. I. Garrels, Dr. S. Pattersson,
Dr. S. M. Hanash and Dr. J. E. Celis for making sample
and 2-DE map exchanges possible. This study was sup- 
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Reference points for comparisons of two-dimensional 
maps of proteins from different human cell types 
defined in a pH scale where isoelectric points correlate 
with polypeptide compositions 



A highly reproducible, commercial and nonlinear, wide-range immobilized pH
gradient (IPG) was used to generate two-dimensional (2-D) gel maps of
[35S]methionine-labeled proteins from noncultured, unfractionated normal
human epidermal keratinocytes. Forty-one proteins, common to most human
cell types and recorded in the human keratinocyte 2-D gel protein database,
were identified in the 2-D gel maps and their isoelectric points (pI) were determined
using narrow-range IPGs. The latter established a pH scale that
allowed comparisons between 2-D gel maps generated either with other IPGs
in the first dimension or with different human protein samples. Of the 41 proteins
identified, a subset of 18 was defined as suitable to evaluate the correlation
between calculated and experimental pI values for polypeptides with
known composition. The variance calculated for the discrepancies between calculated
and experimental pI values for these proteins was 0.001 pH units.
Comparison of the values by the t-test for dependent samples (paired test)
gave a p-level of 0.49, indicating that there is no significant difference between
the calculated and experimental pI values. The precision of the calculated
values depended on the buffer capacity of the proteins, and on average, it
improved with increased buffer capacity. As shown here, the widely available
information on protein sequences cannot, a priori, be assumed to be sufficient
for calculating pI values because post-translational modifications, in particular
N-terminal blockage, pose a major problem. Of the 36 proteins analyzed in
this study, 18-20 were found to be N-terminally blocked and of these only 6
were indicated as such in databases. The probability of N-terminal blockage
depended on the nature of the N-terminal group. Twenty-six of the proteins
had either M, S or A as N-terminal amino acids and of these 17-19 were
blocked. Only 1 in 10 proteins containing other N-terminal groups was
blocked.



1 Introduction 

As compared with carrier ampholyte isoelectric focusing 
(CA-IEF), the application of immobilized pH gradients
(IPGs) in the first dimension in 2-D gel electrophoresis
offers improved reproducibility [1] because the nature of
the pH gradient makes the resulting focusing positions
insensitive to the focusing time [2] and to the type of
sample applied [3]. The recently introduced ready-made
IPG strips [4] seem to be an ideal substitute for the car- 
rier ampholyte gradients, which until now have been the 
most commonly used first dimensions in 2-D gel electro- 
phoresis. The availability of standardized first dimen- 
sions opens the possibility of comparing 2-D gel maps of 
various cell types generated in different laboratories, pro- 
vided that the focusing positions of a number of easily 
recognizable polypeptide spots common to the cell types 
in question are known. Even though this approach is
limited to experiments performed with the same standardized
IPG, the flexibility provided by IPGs allows the
pH gradient to be adjusted to the requirements of a particular
experiment.

Exchange and communication of 2-D gel protein data re- 
quires a pH scale that is independent of the particular 
IPG used and by which the results can be described. The 
introduction of carbamylation trains and the relation of 
focusing positions to the spots in these trains repre- 
sented a step forward towards solving the reproducibility 
problem experienced with carrier ampholyte focusing [5].
Problems associated with the use of carbamylation trains
were mainly due to lack of temperature control and to
the use of nonequilibrium focusing conditions. Accordingly,
the pattern variation involved not only the resulting
pH gradients, but also the relative spot positions
as related to each other and to spots in the carbamylation
trains. Even though the question of reproducibility
has, to a large extent, been solved, the carbamylation
trains are still not ideal as markers because the spots in
the trains do not represent defined entities but rather a
large number of differently carbamylated peptides
having close pI values. As a result, the spots are large
and poorly defined as compared to the ordinary polypeptide
spots in 2-D gel maps.
4-6.5, 1 part Ampholine pH 6-8 and 1 part Pharmalyte
pH 8-10.5. Usually, cathodic sample application was
used and the samples were diluted 2-20 times in a solution
containing 9.8 M urea, 1% w/v CHAPS, 1% w/v
DTT and 35 mM Tris base. For anodic application, the
Tris base was substituted with 100 mM acetic acid. The
degree of dilution and sample volume (20-100 µL)
depended on the particular sample and the IPG, and
whether visualization of the proteins was to be done by
Coomassie Brilliant Blue or silver staining. With the
wide-range nonlinear IPG, 10-30 µg of total protein
was loaded for silver staining and 100-200 µg for Coomassie
staining. Focusing was done overnight with Vh
products in the range of 45-60 kVh with 160 mm long
strips and 50-70 kVh with 180 mm long strips. Solubilization
of polypeptides and blocking of -SH groups prior
to the second-dimensional run, as well as loading on the
second-dimensional gel, was done as described in [9].
The stacking gel was omitted and 5-10 mm were left at
the top of the second-dimensional gel for applying the
IPG strip. The space was filled with electrode buffer containing
0.5% w/v agarose. Casting, running, staining and
autoradiography were carried out as described in [15].

2.4 Experimental determination of pI values

The determination of the pK differences between Immobilines
pK 4.6, pK 6.2 and pK 7.0 necessary for the calibration
of the pH scale at 25 °C in 9.8 M urea was done
as described in [9] with the same narrow-range IPGs.
The pH scale was defined by setting the pK value of
Immobiline pK 4.6 equal to 4.61 [9] and the determined
pK differences gave the pK values of Immobilines pK 6.2
and pK 7.0 equal to 5.73 and 6.54, respectively. The pK
differences found are in good agreement with values derived
from [17] and [8] by extrapolation to 9.8 M urea
concentration. As in [9], additional narrow-range recipes
have been used for determining pI values. With narrow-range
IPGs extending to pH values higher than the pK
value of Immobiline pK 7.0, anodic sample application
was used with acetic acid added to the sample solution.
Otherwise, cathodic sample application was used with
the same sample buffer as for wide-range IPGs.

2.5 Protein compositions used for pI calculations

With the exception of vimentin, protein compositions
are from the Swiss-Prot database [18]. For vimentin, we
used the data from [19], where the amino acid at position
41 is a D instead of an S. Information in the Swiss-Prot
database on phosphorylation has been disregarded
because it was known from earlier studies (J. E. Celis,
unpublished results) that the spots in question corresponded
to the unphosphorylated forms of the peptides.



different substituents on the α-carbon were taken into
account. The calculations of pI values were made with
the aid of the IPG-maker program [20].

2.7 pK values used for pI calculations

For the carboxyl terminal group and internal glutamyl
and aspartyl residues the same pK values were used as in
[9]. For C-terminal glutamyl and aspartyl residues, separate
pK values were derived with the aid of the Taft
equations [9, 21]. The pK values of histidyl groups were
calculated from the pI values of human carbonic anhydrase
I as in [9]. For N-terminal glycine a pK value of
7.50 was used. The pK shift caused by a substituent on
the α-carbon was assumed to be identical with the pK
shift the substituent caused for the amino group in the
amino acid, i.e., 2.28 pH units were subtracted from the
pK values for the amino groups in the amino acids given
in [22, 23]. The approximate pK value of 9 for the cysteinyl
group was taken from [24]. For tyrosyl and arginyl
groups we used the pK values for the amino acids [22,
23]. For lysyl groups the effect of high urea concentration
on amino groups was taken into account and 0.5 pH
units were subtracted from the amino acid pK value.
These last three pK values are far from the pH range
under study and the results found would have been the
same if lysyl and arginyl groups were assumed to be
fully ionized while the ionization of tyrosyl groups was
neglected. A complete list of the pK values used is given
in Table 1.
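
The calculation outlined in Sections 2.6 and 2.7 can be illustrated with a minimal sketch. It is not the IPG-maker program: it simply sums Henderson-Hasselbalch charges over the ionizable groups and finds the pH of zero net charge by bisection. The pK numbers below are placeholder values chosen for illustration (only the N-terminal value of 7.50, quoted for glycine, and the approximate cysteinyl value of 9 come from the text), a single N-terminal pK is used regardless of the terminal residue, and the function names and the example composition are hypothetical.

```python
# Illustrative pK values only; NOT the values of the paper's Table 1.
ACIDIC_PK = {"C_term": 3.55, "D": 4.05, "E": 4.45, "C": 9.0, "Y": 10.0}
BASIC_PK = {"N_term": 7.50, "H": 5.98, "K": 10.0, "R": 12.0}


def net_charge(counts, ph):
    """Net charge of a peptide with the given group counts at a given pH."""
    charge = 0.0
    for group, pk in BASIC_PK.items():
        # Fraction of a basic group that is protonated (+1).
        charge += counts.get(group, 0) / (1.0 + 10.0 ** (ph - pk))
    for group, pk in ACIDIC_PK.items():
        # Fraction of an acidic group that is deprotonated (-1).
        charge -= counts.get(group, 0) / (1.0 + 10.0 ** (pk - ph))
    return charge


def isoelectric_point(counts, lo=0.0, hi=14.0, tol=1e-4):
    """Bisection on pH; the net charge falls monotonically as pH rises."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(counts, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def buffer_capacity(counts, ph, delta=0.01):
    """Charge units per pH unit, estimated by a central difference."""
    return -(net_charge(counts, ph + delta) - net_charge(counts, ph - delta)) / (2 * delta)


# Hypothetical composition: counts of ionizable groups in one polypeptide.
example = {"N_term": 1, "C_term": 1, "D": 10, "E": 12,
           "K": 9, "R": 6, "H": 3, "Y": 4, "C": 2}
pi_free = isoelectric_point(example)

# An N-terminally blocked (e.g. acetylated) chain has no free amino terminus.
blocked = dict(example, N_term=0)
pi_blocked = isoelectric_point(blocked)

print(f"pI, free N-terminus:    {pi_free:.2f}")
print(f"pI, blocked N-terminus: {pi_blocked:.2f}")
print(f"buffer capacity at pI:  {buffer_capacity(example, pi_free):.1f} charge units/pH unit")
```

Setting the N-terminal count to zero mimics N-terminal blockage and moves the computed pI toward lower pH, which is the kind of correction the text applies to suspected acetylated proteins.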



Table 1. pK values used for the ionizable groups in peptides,
9.8 M urea, 25 °C
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4.55 
2.6 Calculation of pI values

For the pI calculations it was assumed that the same pK
value could be used for an amino acid residue in all
polypeptides and in all positions in the peptide except
for N- or C-terminally placed amino acids. For the pK
values of the N-terminal amino groups the effect of the



2.8 Statistical analysis 

Statistical comparisons of the experimental and calculated
pI values were done on an Apple Macintosh IIsi
using the statistical package Statistica/Mac, release 3.0b
(from StatSoft Inc., Tulsa, Oklahoma). Calculated and
experimental pI values were compared by the t-test for
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aggregates of acidic and basic keratins. An increase in 
urea concentration to 9 M or more eliminated these
streaks; apart from this effect, no other major changes in 
the focusing positions were observed. In Fig. 1 we have 
indicated the positions of 41 known proteins from the 
human keratinocyte 2-D gel database that are most 
likely common to most human cell types. The choice 
was made because these proteins are easy to identify 
with certainty. With the exception of stratifin (spot 2), 
involucrin (spot 4) and keratin 14 (spot 15), which are all
epithelial markers, these proteins are also present in
human fibroblasts (Fig. 3) and lymphocytes (results not
shown), and therefore can be used as landmarks for comparing
2-D gel maps derived from different cell types. In
Table 2 the 41 proteins are listed together with their
sample spot numbers (SSP) in the human keratinocyte
protein database and pI values determined in 2-D gel
maps generated with narrow-range IPGs in the first
dimension.




Figure: 2-D gel protein map of [35S]methionine-labeled proteins from noncultured, unfractionated normal human keratinocytes focused with
CA-IEF in the first dimension. The positions of the 41 proteins analyzed in this study are indicated.
3.2 Comparison between the determined and calculated
pI values for human keratinocyte proteins

Thirty-six of the 41 proteins listed in Table 2 are found
in the Swiss-Prot database. Contrary to the plasma and
liver proteins used in [9], the pI calculations on the proteins
used in this study posed some problems that
reflected the way in which they were characterized. The
proteins used by Bjellqvist et al. [9] were either very
abundant and well-characterized plasma proteins or they
were identified by N-terminal sequencing and, therefore,
the nature of the N-terminals (acetylated or non-acetylated)
was in both cases known. The proteins used in
this study have all been characterized by internal
sequencing [7] and it is known that N-terminal acetylation
occurs with high frequency in eukaryotes.


Electrophoresis 1994, 15, 529-539



values derived from the data on plasma and liver proteins
in [9] (Table 3), the present data are found to result
in larger variances for the values of both pI discrepancies
and calculated charge at the experimental pI value when
no information on posttranslational modification is
taken into consideration. Correction for possible N-acetylation
of 12 polypeptides with M, S and A as N-terminal
results in a smaller variance of pI discrepancies, although
not significantly different from values derived
from [9], whereas the variance of the calculated charge at
the experimental pI value is significantly higher. For the
18 selected proteins the variance for the pI discrepancies
is significantly smaller than for the data in [9]; however,
the corresponding value for calculated charge at the
experimental pI value does not improve to the same
extent. This, we believe, reflects another difference
between the two sets of proteins used for the calculations.
Based on spot distributions in 2-D gel maps, the
set of proteins used here has a molecular weight distribution
that is more representative of the patterns observed
in mammalian cells. In the study by Bjellqvist
et al. [9] most of the high molecular weight plasma proteins
had to be excluded due to their unknown content
of sialic acid, which made the proteins analyzed in that
study heavily biased towards low molecular weight proteins.
The buffer capacity of proteins normally increases
with the protein's molecular weight, and the average
buffer capacity of the presently selected proteins with
assumed known N-terminals is 18 charge units/pH unit,
while the corresponding value for the proteins used in
[9] is only 9 charge units/pH unit. High buffer capacity
can be expected to improve the agreement between calculated
and experimental pI values. Inspection of the
data presented in Table 2 for the polypeptides with
assumed known N-terminals verifies the importance of
the buffer capacity. For 8 polypeptides having buffer
capacities higher than 15 charge units/pH unit, the calculations
in all cases yielded pI discrepancies with absolute
values of less than 0.02 pH units. The largest discrepancy,
0.00 pH units, was observed for annexin II and
stathmin, proteins which have low buffer capacity: 0.9
and 6.6 charge units/pH unit, respectively. The probability
that the focusing position of a protein with known
composition will fall within a certain distance from the
calculated pI value therefore cannot be predicted by the
variance alone. The buffer capacity of the specific protein
must be taken into consideration as well. As indicated
by the decrease of the variance of calculated charges at
the experimental pI value for the selected proteins, the
observed improvement cannot solely be due to the
higher buffer capacity of the keratinocyte proteins. The
two studies relate to different experimental conditions.
Good agreement between experimental and calculated
pI values implies that the proteins are unfolded, and a
factor that may contribute to the observed improvement
is a more complete unfolding of proteins caused by the
higher temperature and urea concentration used in this
study.

The data indicate that the precision with which pI
values can be predicted for polypeptides with high buffer
capacity is better than the precision with which experimental
pI values can be determined. If the pH is defined
through the pK values of the immobilized groups in the
IPG-containing gel, the precision of the experimentally
calculated data will depend on the pH difference
between the pI and the pK value of the immobilized
group with the closest pK. For the present study this will
give pI determinations with a precision varying in the
range of ± 0.02-0.05 pH units [9]. The good agreement
observed between the calculated and experimental pI
values is due to the fact that errors are mainly systematic
and, as discussed in [9], they will largely be cancelled
out in the calculations. A pH scale defined through the
presently determined pI values will not necessarily
reflect the variation of the hydrogen ion activity during
the focusing step in an optimal way, but it still allows
precise predictions of focusing positions for polypeptides
with known compositions, including information on
posttranslational modifications. Calculated net charge at
the experimentally found isoelectric point defined in this
scale will serve as a tool to verify that the polypeptide


Table 3. Mean values and variances for the difference (experimental pI - calculated pI) in pH units and calculated charges at the experimental pI
values, respectively

                                   Plasma and liver     Keratinocyte proteins (9.8 M urea, 25 °C)
                                   proteins             All peptides        All peptides after   Known N-terminal
                                   (8 M urea, 10 °C)                        correction for       configuration (or very
                                                                            N-acetylation        likely configuration)

Number of proteins                 29                   36                  36                   18

                                   Mean     Variance    Mean     Variance   Mean     Variance    Mean     Variance

Experimental pI -
calculated pI                      -0.011   0.005       0.072    0.017      0.019    0.003       0.005    0.001
F-value (pI discrepancy) a)        1                    3.4                 1.67                 5
p-level (pI discrepancy) b)        0.5                  0.0005              0.0721               0.0004

Calculated charge at the
experimental pI value              -0.070   0.227       0.32?    0.871      0.009    0.444       -0.014   0.109
F-value (calculated charge
at the experimental pI value) a)   1                    3.8                 1.96                 2.08
p-level (calculated charge
at the experimental pI value) b)   0.5                  0.0002              0.0338               0.0536

a) Comparison to the data in [9]. F = s1²/s2², where s1² is the larger of the two variances
b) p(F(ν1, ν2) > F-value), where ν1 and ν2 are the degrees of freedom for s1² and s2², respectively







have been difficult and work-intensive to determine.
Recent developments in the field of mass spectrometry 
are fast changing this situation and within the next years 
we can expect a surge in reliable data in this area. While 
awaiting this development, verification of correctness 
and completeness of available information on polypeptide
composition can be provided by experimental pI
values in a pH scale based on the pI values determined
in this study. So far, our data cover the pH range below
pH = ".5. The basic pH range covered by NEPHGE as 
first dimension will be covered in forthcoming work. 

Received December 29. 1995 
composition used in the calculation is correct and complete.
Exceptions to this are proteins such as involucrin
and heat shock protein 90 that have very high buffer
capacities. Introduction of an extra charge unit into
these proteins will only result in pI shifts falling in the
range of 0.01-0.02 pH units and the effect is that the
quality of the pH definition (the precision by which pK
values used in the calculations are given and the precision
of experimental pI values) will in these cases limit
the possibilities to verify polypeptide composition based
on the experimental pI value.
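
The link between buffer capacity and the size of a pI shift can be made concrete with a small sketch. It assumes a linear approximation near the pI (shift = extra charge / buffer capacity), which is an assumption of this example rather than a formula stated in the text; the function name is hypothetical. The buffer capacities of 9 and 18 charge units/pH unit and the B23 figures (0.11 pH units for 3 charge units) are the ones quoted in this paper.

```python
def pi_shift(extra_charge, buffer_capacity):
    """Approximate magnitude of the pI shift (pH units) caused by extra charge.

    Linear approximation: buffer capacity is in charge units per pH unit.
    """
    return extra_charge / buffer_capacity


# Average buffer capacities quoted for the two protein sets.
for label, capacity in (("plasma and liver proteins [9]", 9),
                        ("selected keratinocyte proteins", 18)):
    shift = pi_shift(1, capacity)
    print(f"{label}: one charge unit shifts the pI by about {shift:.3f} pH units")

# Buffer capacity implied for nucleolar protein B23, where a 0.11 pH unit
# discrepancy is said to correspond to 3 charge units.
implied_b23_capacity = 3 / 0.11
print(f"implied B23 buffer capacity: about {implied_b23_capacity:.0f} charge units/pH unit")
```

Under the same approximation, a shift of only 0.01-0.02 pH units per charge unit corresponds to buffer capacities of roughly 50-100 charge units/pH unit, which is why a single missing charge is hard to detect for proteins such as involucrin.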

Statistical comparison of experimental and calculated pI
values was done using the t-test for dependent samples
and normality of the discrepancies was estimated by
probability plots. For the 36 proteins, the p-level is
0.0021, indicating that a result like this is unlikely to
be a chance effect and must be assumed to represent a
real difference. After correction for the most likely
N-terminal configuration, the p-level is 0.043 and the two
sets of values cannot be accepted as representing the same
population since the p-level is less than 0.05, the traditional
limit of statistical significance. For the 18 proteins with a known
or very likely N-terminal configuration the t-test gave a
p-level of 0.49, which verifies that the experimental and
calculated pI values are not significantly different.
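
For readers unfamiliar with the t-test for dependent samples, the statistic can be sketched in a few lines. The pI values below are invented purely for illustration and are not data from this study; the function name is hypothetical. The sketch returns the t statistic and its degrees of freedom only; turning these into a p-level requires the t distribution (for example via `scipy.stats.ttest_rel`, if SciPy is available).

```python
import math
import statistics


def paired_t(experimental, calculated):
    """t statistic and degrees of freedom for paired (dependent) samples."""
    diffs = [e - c for e, c in zip(experimental, calculated)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Hypothetical experimental and calculated pI values for five proteins.
experimental_pi = [5.10, 6.02, 4.75, 5.48, 6.90]
calculated_pi = [5.08, 6.05, 4.74, 5.50, 6.87]

t_stat, dof = paired_t(experimental_pi, calculated_pi)
print(f"t = {t_stat:.3f} with {dof} degrees of freedom")

# Two-sided 5% critical value of t for 4 degrees of freedom.
T_CRITICAL_DF4 = 2.776
print("significant at 0.05" if abs(t_stat) > T_CRITICAL_DF4 else "not significant at 0.05")
```

A small |t| (large p-level), as reported for the 18 selected proteins, means the mean discrepancy is small relative to its scatter; it does not by itself show that individual discrepancies are small, which is why the text also reports the variance.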

Besides showing that pI values for denatured proteins
with known compositions can be calculated with a high
degree of precision from average pK values, the results
also provide strong support for the notion that
N-terminal blockage heavily depends on the nature of
the N-terminal groups [26]. The results seem to indicate
that with N-terminals other than M, S and A, only a few
proteins have blocked N-terminals (1 out of 10 proteins
in the present study), while it can be inferred from the
data presented in Table 2 that a majority of the proteins
with M, S and A as N-terminal are blocked. After correction
for the effect of suspected N-terminal blockage
there is only one protein (nucleolar protein B23) out of
the 36 used in this study, which, in spite of a high buffer
capacity, has a marked difference of 0.11 pH units
between predicted and determined pI values (Fig. 4B);
this corresponds to 3 charge units due to the high buffer
capacity of this protein. This discrepancy in pI prediction
and calculation of net charge at the pI is probably not
due to deficiencies in the database information but
instead reflects a shortcoming of the model used for pI
calculations. Nucleolar protein B23 contains a domain
extremely rich in aspartic and glutamic acid residues
(Table 4), in which 26 out of 28 amino acid residues
from position 161 to 188 are either a D or an E. A calculation
based on the use of average pK values uninfluenced
by the charged neighboring amino acid residues
cannot be expected to correctly describe the pI
value with almost half of the acidic groups packed



Table 4. Amino acid sequence of nucleolar phosphoprotein B23 
together into a highly negatively charged region. The 
limitation caused by calculations based on average pK 
values does not severely limit the usefulness of the 
approach since a search through Swiss-Prot shows that 
this type of D/E-rich motif is uncommon, and the exis- 
tence of a highly charged region is immediately apparent 
upon inspection of the amino acid sequence. 

The quality of the information available in databases, 
especially concerning posttranslational modifications, is 
a major problem when the data is to be used for pI pre- 
dictions. The p-level of 0.043 found for all 36 proteins 
after correction for N-acetylation shows that this prob- 
lem is not only limited to N-terminal blockade and the 
very good agreement found for the eighteen polypep- 
tides with assumingly correctly described N-terminals 
(Fig. 4C) must be regarded as an exception from this 
point of view. N-Terminal blockage is generally the main 
problem in relation to pI predictions for eukaryotic pro- 
teins. Of the 36 keratinocyte proteins analyzed, 18-20 
are suspected to be N-terminally blocked (6 proteins 
blocked according to Swiss-Prot, 12 proteins with M, S or A 
as N-terminal and assumingly blocked based on the cal- 
culated charge, and two proteins, involucrin and 
nucleolar protein B23, with M as N-terminal for which 
the data does not allow any conclusion). This is in rea- 
sonable agreement with the conclusions based on the 
N-terminal sequencing data derived in connection with 
2-D gel electrophoresis. N-terminal blockage can be sus- 
pected for 17-19 of the 26 proteins with M, S or A as 
N-terminal, while only 1 in 10 proteins with other 
N-terminal groups are blocked. The information that the 
frequency of N-terminal blockage is strongly related to 
the nature of the N-terminal group will be of some help 
in connection with pI predictions based on database 
information. However, without information from other 
sources, an uncertainty will always remain as to whether 
the N-terminal charge should be included in the pI calcu- 
lation. 
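
The effect of including or omitting the N-terminal charge can be illustrated with a minimal sketch of a composition-based pI calculation. It treats every ionizable group as independent, computes the net charge from the Henderson-Hasselbalch relation and finds the pH of zero net charge by bisection. All pK values in the sketch are illustrative placeholders assumed for this example, not the values tabulated in the paper, and the function names and the test sequence are hypothetical.

```python
# Illustrative pK values only (assumed for this sketch, not from the paper).
PK_BASIC = {"H": 6.0, "K": 10.0, "R": 12.0}              # positive when protonated
PK_ACIDIC = {"D": 4.0, "E": 4.4, "C": 9.0, "Y": 10.0}    # negative when deprotonated
PK_N_TERMINAL = 7.5
PK_C_TERMINAL = 3.5


def net_charge(sequence, ph, n_terminal_blocked=False):
    """Net charge of a denatured polypeptide at a given pH."""
    charge = 0.0
    for residue, pk in PK_BASIC.items():
        charge += sequence.count(residue) / (1.0 + 10.0 ** (ph - pk))
    for residue, pk in PK_ACIDIC.items():
        charge -= sequence.count(residue) / (1.0 + 10.0 ** (pk - ph))
    charge -= 1.0 / (1.0 + 10.0 ** (PK_C_TERMINAL - ph))  # C-terminal carboxyl
    if not n_terminal_blocked:
        # A blocked (e.g. acetylated) N-terminal carries no positive charge.
        charge += 1.0 / (1.0 + 10.0 ** (ph - PK_N_TERMINAL))
    return charge


def isoelectric_point(sequence, n_terminal_blocked=False, tolerance=1e-4):
    """pH at which the net charge is zero, located by bisection."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge decreases monotonically with pH.
        if net_charge(sequence, mid, n_terminal_blocked) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


sequence = "MKDEAGHK"  # hypothetical test peptide
print(round(isoelectric_point(sequence), 2))
print(round(isoelectric_point(sequence, n_terminal_blocked=True), 2))
```

With these placeholder values the blocked form always focuses at a lower pH than the free form, because blocking removes one positive charge. For a short peptide the difference is large; for a protein with a high buffer capacity it shrinks toward the experimental precision.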



4 Concluding remarks 

The data presented here lay the foundation for com- 
paring 2-D gel protein maps of different cell types gener- 
ated with nonlinear, wide-range IPGs in the first dimen- 
sion. The focusing positions of 41 polypeptides common 
to most human cell types have been described in a pH 
scale that allows focusing positions to be predicted with 
a high degree of accuracy, provided that the composition 
of the polypeptides is known and that information on 
posttranslational modifications is available. For poly- 
peptides with a very high buffer capacity, the limiting 
factor is the precision with which experimental pH 
values can be determined rather than the precision of 
the calculations. Possible deficiencies in the pH scale 
description of the variation of the hydrogen ion activity 
have, at least at the present state, no consequences for its 
practical use. The major limitation in connection with 
predictions of focusing positions from polypeptide com- 
positions is the quality of existing data on protein com- 
positions, especially concerning posttranslational modifi- 
cations. Amino acid sequences have been reasonably 
easy to obtain, while posttranslational modifications 
According to Brown and Robert [25], proteins with acety- 
lated N-terminals correspond in weight to approximately 
80% of the soluble protein in ascites cells. Based on 
results from N-terminal sequencing, at least 40% of the 
spots in the human liver protein 2-D gel map appear to 
be blocked [3]. The corresponding number, derived from 
107 spots in the 2-D gel map of human T-lymphocyte 
proteins, falls between 60 and 65% (J. Strahler, personal 
communication). Information concerning N-terminal 
blockage is not normally available, and in the Swiss-Prot 
database only 6 of the 36 keratinocyte proteins are speci- 
fied as N-terminally blocked. We have, within the present 
material, defined 18 proteins for which the N-terminals 
are very likely to be correctly described. Six of these pro- 
teins are listed in the Swiss-Prot database as N-termi- 
nally blocked, four represent proteins which appear in 
the human liver 2-D gel map and have been N-termi- 
nally sequenced as liver proteins [3] and the remaining 
eight have N-terminal groups other than M, S and A, i.e. 
N-terminals for which N-acetylation is uncommon [26]. 
In Figs. 4A, B, C and D, pI values calculated from Swiss- 
Prot database information are plotted against the experi- 



mentally determined pI values for all the keratinocyte 
proteins listed in Table 2 and for the 18 selected pro- 
teins, as well as for the plasma and liver proteins 
(from [9], valid for 10°C). 

The calculations show that without knowledge of the 
status of the N-terminal group, precise predictions of pI 
values for eukaryotic proteins cannot be achieved based 
on the information available in Swiss-Prot and similar 
databases. However, for proteins where the N-terminal 
status is known, we find good correlation between pre- 
dicted and experimental pI values. When the variance of 
the pI discrepancies and the variance of calculated 
charges at the experimental pI values derived from the 
present data set are compared with the corresponding 



• There are four plots: (A) the 36 polypeptides from normal human 
keratinocytes (no corrections), (B) the 36 polypeptides from Fig. 4A 
where pI values have been recalculated for 12 polypeptides with M, 
S and A as N-terminal, assumed blocked based on calculated 
charge, (C) the 18 selected polypeptides with information on the 
N-terminal configuration, and (D) plasma and liver proteins. 



Figure 4. Calculated vs. experimental pI values. Lines are fitted using the least squares criterion. (A) 36 polypeptides from normal human kerati- 
nocytes (no corrections). (B) 36 polypeptides from Fig. 4A (including the 18 marker polypeptides) where pI values have been recalculated 
assuming N-terminal blockage; x indicates recalculated pI values; nucleolar protein B23 is indicated with an arrow. (C) 18 polypeptides with infor- 
mation on N-terminal configuration and (D) plasma and liver proteins. 






B. Bjellqvist et al. 



correlated samples (paired t-test). The normality of pI 
differences was estimated graphically by probability 
plots. The variances of the data presented here and the 
similar data on plasma and liver proteins in [9] were 
compared by the F-test. 

3 Results and discussion 

3.1 Identification of polypeptides and pI determinations 

The 2-D gel maps of [35S]methionine-labeled proteins 
from noncultured, unfractionated normal human kerati- 
Neidhardt et al. [6] defined the pH gradient in 2-D gel 
experiments by pI markers whose pI values were calcu- 
lated from the amino acid composition. Focusing posi- 
tions of other polypeptides could be predicted from their 
composition but the pK values needed for the pI calcula- 
tions were unknown. Various groups employing this 
approach do not use the same pK values [6, 7] and there- 
fore, the pI values derived in this way cannot be 
expected to describe the variation of the hydrogen ion 
activity. In spite of this fact, it is still possible to make 
approximate predictions of focusing positions because 
the pK values used to define the pH gradient are also 
used to calculate pI values and to predict the focusing 
positions. Errors in pK assignments are therefore com- 
pensated. A pH scale which correctly reflects the variation 
in hydrogen ion activity during focusing should improve 
the precision of the predictions, but this has never been 
implemented with CA-IEF focusing as a first dimension 
in 2-D gel electrophoresis. The main reason for this is 
the problems associated with pH measurements in 
focused gels containing high concentrations of urea. 

IPGs can be described from the concentration variation 
of the immobilized groups, provided that the pK values 
of these groups are known for the conditions prevailing 
during focusing. To avoid measurements on gels, Gia- 
nazza et al. [8] suggested the use of pK values derived by 
addition of determined pK shifts. Recently, direct deter- 
minations of pK differences between immobilized 
groups in IPGs were made by determining pI-pK values 
in overlapping narrow-range IPGs [9, 10] and the results 
verified the applicability of the Gianazza approach. A 
description of the focusing results in a pH scale, which 
correctly describes the variation of the hydrogen ion 
activity for the focusing conditions used, not only allows 
the comparison of 2-D gel maps generated with different 
IPGs, but also opens the possibility for correlating the 
focusing position of a polypeptide with its composition 
[9]. Experiments by Bjellqvist et al. [9, 10] have implied 
that pH scales showing good correlation between calcu- 
lated and experimental pI values can be derived for any 
of the conditions commonly used for focusing in connec- 
tion with 2-D gel electrophoresis. These pH scales are 
then defined through the pK values of the immobilized 
groups in the IPG-containing gel. To be useful for inter- 
laboratory comparisons, however, the pH scale has to be 
defined through pI values of easily recognizable spots 
present in the 2-D gel map. So far, pI determinations in 
a useful pH scale, combined with determinations of pK 
values needed for pI calculations, have only been made 
for the pH range 4.5-6.5 at 10°C [9]. CA-IEF focusing as 
described by O'Farrell [11] does not control the tempera- 
ture of the first dimension, which can be expected to be 
slightly above room temperature. With IPGs, the temper- 
ature commonly used is about 20°C [4, 12] or 25°C [13] 
and this is a critical parameter that needs to be con- 
trolled [14]. 

The present work was designed to compare 2-D gel maps 
of different cell types in a laboratory applying both 
CA-IEF and IPG focusing at a common temperature. To 
this end we have generated 2-D gel maps of proteins 
from noncultured, unfractionated normal human epi- 
dermal keratinocytes with IPG in the first dimension 



and a focusing temperature of 25°C. We have used com- 
mercial nonlinear, wide-range IPG strips which give 2-D 
gel maps that are closely similar to the ones resulting 
with the CA-IEF technique used to establish the human 
keratinocyte database [15]. As an initial step towards 
interlaboratory comparisons of results obtained with the 
nonlinear gradient as a first dimension we report here 
on the focusing positions of 41 known proteins that are 
common to most human cell types. The pH range 
covered corresponds to the range in classical CA-IEF 
2-D gel electrophoresis and in order to use these pro- 
teins as internal standards for comparing 2-D gel maps 
generated with other IPGs we determined their pI values 
with narrow-range IPGs in the first dimension. We have 
compared the calculated versus experimental pI values 
and show that it is necessary to have further information 
(absence or presence and nature of posttranslational 
modifications), in addition to amino acid composition, to 
be able to calculate pI values that correspond to the 
actual experimental values. The pK values used for the 
calculations are provided and the usefulness of pI predic- 
tion in relation to database information is discussed. 
Furthermore, we comment on the possibility of using 
experimentally determined pI values to verify the avail- 
able database information on polypeptide composition. 



2 Materials and methods 

2.1 Apparatus and chemicals 

Equipment for isoelectric focusing and horizontal SDS 
electrophoresis (Multiphor II electrophoresis chamber, 
Immobiline strip tray, Multidrive XL programmable 
power supply, Macrodrive power supply and Multitemp 
II) was from Pharmacia LKB Biotechnology AB 
(Uppsala, Sweden). Vertical second-dimensional gels 
were run in the home-made equipment described in [15]. 
The IPG strips with the wide-range nonlinear pH gra- 
dient were either Immobiline DryStrip pH 3-10 NL, 
180 mm or alternatively 160 mm long IPG strips with a 
corresponding pH gradient. In both cases the IPG strips 
were delivered by Pharmacia LKB. Immobiline, Pharma- 
lyte, Ampholine, GelBond as well as PAG film and the 
ready-made horizontal SDS gels (ExcelGel XL SDS 
12-14) were also from Pharmacia LKB. Purified proteins 
and peptides were from Sigma (St. Louis, MO). 

2.2 Sample preparation 

Preparation and labeling of unfractionated keratinocytes 
as well as fibroblasts have been described in [16]. Cells 
were lysed in a solution containing 9.8 M urea, 2% w/v 
NP-40, 100 mM DTT and 2% v/v Ampholine pH 7-9. 

2.3 2-D gel electrophoresis 

First-dimensional focusing was performed according to 
Görg et al. [2] with some minor modifications, as de- 
scribed in [9]. Rehydration of the IPG strips was made 
in a solution containing 9.8 M urea, 2% w/v CHAPS, 10 
mM DTT and 2% v/v carrier ampholyte mixture. The car- 
rier ampholyte mixture consisted of 2 parts Pharmalyte 
from the interphase showed a viability of more than 
90% as judged by trypan blue exclusion test. However, it 
was found that the yield of viable cells decreased drama- 
tically if the tissue resection was not immediately pro- 
cessed. To study the effect of lysis of cells during the pre- 
paration procedure, 2-DE maps were prepared from 
nonenzymatically extracted cells of case LB collected 
from the top fraction (nonviable, Fig. 7a) and interphase 
fraction (viable, Fig. 7b). These 2-DE maps were 
compared with corresponding fractions (nonviable, Fig. 
7c, and viable, Fig. 7d) of enzymatically extracted cells. 
One clear disadvantage of the enzymatic technique was 
that when loss of cell viability occurred during prepara- 
tion, a dramatic loss of high molecular weight polypep- 
tides was observed (Fig. 7c). This was probably due to 
degradation of intracellular proteins. However, nonenzy- 
matic preparations showed fewer differences between 
viable and nonviable cells: the most pronounced altera- 
tion was a decrease of a group of mucin-related pro- 
teins (Fig. 7b). We conclude, therefore, that a disconti- 
nuous Percoll gradient is necessary after enzymatic 
extraction of cells, but can be omitted from the nonenzy- 
matic tumor sample preparation procedure. 

We used the MDA-231 cell line to study the effects of 
cell lysis and leakage of cytosolic polypeptides during 
sample preparation. Remarkably, after 30, 50, 80 and 140 
min of incubation in PBS/PIH at 0°C no significant 
changes were observed in the 2-DE pattern (not shown). 
Although loss of cell viability may not result in protein 
degradation when cells are incubated in the presence of 
protease inhibitors, loss of cytosolic proteins would be 
expected during pelleting of cells. We monitored the loss 
of lactate dehydrogenase (LDH) activity into the super- 
natant during incubation in PBS of MDA-231 and MCF- 
7 breast cancer cells at 20°C. In both cases, loss of via- 
bility was paralleled by release of LDH from the cells 
(Fig. 8). After 5 h, 70% of the MCF-7 cells, but only 30% 
of the MDA-231 cells were dead (not shown). 
tion time of the mammary carcinoma cell lines MDA-231 and MCF-7 
during incubation in PBS at 20°C. 



These data indicate the impact of a rapid preparation 
procedure, at low temperature, of fresh tumor samples. 
Experiments have also been performed using only 
1.07 g/mL Percoll (Fig. 6c and Fig. 1, left test tube) in 
order to remove erythrocytes. One clear advantage with 
this procedure, which today is routinely utilized, is a 
higher yield of viable cells, probably due to decreased 
sample preparation time. 

4 Discussion 

We describe procedures for sample preparation from 
solid tumors for 2-DE. 2-DE maps could be derived 
from solid tumors which were similar in quality to those 
obtained from cultured cells. Compared to methods 
using frozen material, the resolving power of the 2-DE 
technique is increased, allowing examination of a large 
number of polypeptides from tumors of different malig- 
nancies. Other investigators [12, 22] have used samples 
from frozen tumors to derive 2-DE maps. We have previ- 
ously described disadvantages encountered using frozen 
tumor samples including variations in contaminating pro- 
teins between different samples [3]. The methods de- 
scribed here are based on the preparation of cells from 
tumors without enzymatic digestion. The enzymatic step 
could be avoided since malignant cells usually grow as 
solid masses which are not strongly attached to the 
matrix. Furthermore, we found that omitting the enzy- 
matic digestion alleviated the necessity of purifying 
viable tumor cells on Percoll gradients. This was in sharp 
contrast to enzymatically treated samples, where loss of 
viability leads to loss of high molecular weight proteins 
(Fig. 7c). 

At least in the case of lung cancer, viable and nonviable 
cells showed small differences in respect to 2-DE maps. 
Presumably, protease inhibitors penetrate cells and 
inhibit proteolysis. In model experiments, we observed 
leakage of cytosolic protein (LDH) from the cells in 
parallel to loss of viability. Apparently, however, only a 
limited decrease of the level of low molecular weight 
cytosolic polypeptides was detected using silver staining 
combined with visual inspection. We have found that 
although some tumors are well suited for the prepara- 
tion procedure described, others are not. In general, 
good results were obtained using tumors of the lung, 
breast, corpus and lymphomas. In contrast, cells from 
thyroid adenomas and hypernephroma showed poor via- 
bility. We were in these cases unable to separate nonvi- 
able cells from viable cells, and we can therefore not 
evaluate the consequence of the loss of viability on 
2-DE patterns, apart from a loss of some low molecular 
weight cytosolic polypeptides. 

Highly differentiated tumors may show lower viability as 
compared with poorly differentiated tumors (Dr. Farkas 
Vanky, personal communication). A number of samples 
from thyroid tumors were prepared for 2-DE but most 
cases showed poor viability. We believe that special care 
is needed during preparation of generally highly differen- 
tiated tumor groups. The difference between loss of via- 
bility/leakage of LDH of the more differentiated MCF-7 
cells and the less differentiated MDA-231 cells is in line 
3.2.2 Comparison of different methods for preparing 
cells from fresh tumor tissue 

Samples were prepared from breast and lung carcinomas 
using either an enzymatic treatment with collagenase/ 
elastase or using nonenzymatic preparations (Fig. 4). A 
number of differences in the protein profiles were ob- 
served in the resulting 2-DE gels, some of which are 
indicated in Figs. 4a and b. These differences include 
both increases and decreases in spot intensity. These dif- 
ferences may result from degradation of high molecular 
weight polypeptides during enzymatic treatment, in- 
creased solubilization of polypeptides, or may have other 
causes. For many tumors, it was only possible to obtain 



small amounts of material since they were reserved for 
other examinations. In these cases, samples could be pre- 
pared for 2-DE using either needle aspiration or 
scraping. Figure 5a shows a 2-DE gel prepared from 
squamous lung carcinoma (LS) cells collected by needle 
aspiration and Fig. 5b shows a gel prepared from the 
same tumor by scraping. In this case, a number of differ- 
ences were recorded between the two procedures, some 
of which are arrowed in Fig. 5. Samples obtained from 
other tumors (breast and lung) generally showed fewer 
differences between these two methods of cell sampling 
(not shown). These data show that different nonenzy- 
matic extraction procedures may yield different polypep- 
tide patterns. However, the number of spots with a large 
Figure 5. 2-DE analysis of a case of lung cancer (LS). Comparison of 2-DE gel quality and detected spots (arrowheads and circles) between 
(A) aspirated (needle aspiration) and (B) scraped preparations from fresh tissue. 




Figure 6. 2-DE analysis of three other types of tumors: (A) hypernephroma, (B) an adenoma of the thyroid and (C) corpus cancer, using the 
nonenzymatic preparation technique. Arrowheads and circles indicate some cytosolic polypeptides. 
2.4.5 Preparation of frozen tumor tissue 

The technique has been described previously [3, 12]. 
Briefly, the sample is ground frozen to a fine powder, 
homogenized, lyophilized and solubilized in sample 
buffer. 

2.4.6 Control of representative 

The tumors were examined routinely by experienced 
pathologists and smears or imprints from the samples 
were also assessed for cytometric DNA content by 
microspectrophotometry. 

2.5 2-D PAGE 

2-D PAGE was performed as described [8, 10] except for 
the following details. The glass tubes for IEF, 1.2 × 200 
mm, contained 2.0% Resolyte, pH 4-8 (BDH) and were 
cast to a height of 180 mm. A stock solution of acryl- 
amide (Serva) and N,N'-methylenebisacrylamide (16.7:1 
for IEF and 37.5:1 for the second dimension) was deio- 
nized by mixing with 5% w/v Duolite MB 5313 mixed- 
resin ion exchanger (BDH) for 30 min, filtered (with a 
0.22 µm nitrocellulose filter) and stored at -70°C. 
N,N'-Methylenebisacrylamide, N,N,N',N'-tetramethyleth- 
ylenediamine (TEMED) and ammonium persulfate were 
purchased from Bio-Rad. IEF tubes were prefocused at 
200 V for 60 min. To each tube a sample corresponding to 
20-40 ng protein was applied and focused for 14.5 h at 
800 V and finally 1.0 h at 1000 V using a Protean II cell 
(Bio-Rad) and Model 1000/500 Power Supply (Bio-Rad). 
The tube gels were finally extruded into 1.25 mL equili- 
bration buffer, containing 60 mM Tris, pH 6.8 (2% SDS, 
100 mM dithiothreitol and 10% glycerol), frozen on dry 
ice and stored at -70°C. In the second dimension (1.0 × 
180 × 90 mm) the acrylamide concentration was 10% 



T, and the gel contained 376 mM Tris, pH 8.8, and 0.1% 
SDS. IEF gels were applied on top of the slab gel, sealed 
with 0.5% agarose containing electrophoresis running 
buffer (60 mM Tris-base, 0.2 M glycine and 0.1% SDS) 
and electrophoresed with 10-11 mA per gel (constant 
current) at +10°C. Six gels were run together in a Pro- 
tean II xi 2-D Multi-Cell (Bio-Rad). Proteins were visual- 
ized by silver staining and photographed with the acidic 
side to the left [13, 14]. 
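
As a rough cross-check of the first-dimension focusing program described in this section, the voltage-time product can be summed over the prefocusing and focusing steps. The sketch assumes each step ran at a constant voltage for the stated time and ignores any ramping; the volt-hour total is not reported in the text, and the names used are illustrative.

```python
# (voltage in V, duration in h): prefocusing, main focusing, final focusing
ief_steps = [(200, 1.0), (800, 14.5), (1000, 1.0)]


def total_volt_hours(steps):
    """Sum of voltage x time over all steps, in volt-hours (Vh)."""
    return sum(voltage * hours for voltage, hours in steps)


print(total_volt_hours(ief_steps))      # all three steps
print(total_volt_hours(ief_steps[1:]))  # focusing only, without prefocusing
```

Reporting the program as volt-hours makes it easier to compare with runs that use a different voltage schedule.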

2.6 Identification of polypeptides 

Vimentin and vimentin-derived polypeptides were identi- 
fied by extraction of an MDA-231 cell lysate with 0.6 M 
KCl/0.5% NP-40 [15]. Tropomyosins were extracted 
from MDA-231 and WI38 cell lysates [16], and cytokera- 
tins were extracted from MDA-231 and MCF-7 cell 
lysates [17]. The patterns were compared with published 
maps [19-21]. Proliferating cell nuclear antigen (PCNA) 
was identified by immunoblotting (PC10 mAb, Dako- 
patts) using a semidry system (Multiphor II Nova Blot, 
Pharmacia-LKB Biotechnology AB) and enhanced che- 
moluminescence (ECL) detection (Amersham). 

3 Results 

3.1 2-DE of samples prepared from normal and 
tumorigenic cultured cells 

The object of this study was to develop methods for pre- 
paration of 2-DE maps from human tumor tissue which 
have the same high resolution as those obtained from 
cultured cells. Shown in Fig. 2 are high resolution 2-DE 
gels prepared from cultured cells and one leukemia: 
SV40 transformed embryonal rat fibroblasts WT2 (Fig. 
2a); human MDA-231 breast carcinoma cells (Fig. 2b); 
human WI38 fibroblasts (Fig. 2c) and human pre B-ALL 
Figure 3. 2-DE analysis of a case of lung adenocarcinoma (LA). Comparison of 2-DE gel quality between (A) frozen and (B) fresh (needle 
aspiration) tissue preparation. 
roma, a tumor of the kidney (KH), and finally one case 
of poorly differentiated corpus carcinoma (CP). 

2.3 Preparation of cultured cells 

The cell monolayers were washed twice in phosphate 
buffered saline (PBS) and then scraped off in ice-cold 
PBS including protease inhibitors (PIH: phenylmethyl- 
sulfonyl fluoride (PMSF) 0.2 mM and 0.83 mM benzami- 
dine), pelleted at 660 × g for 3 min (+4°C) and washed one 
time before final centrifugation at 2700 × g, 5 min. The 
wet weight of the cell pellet was recorded and the cells 
were stored at -80°C until further processing. 

2.4 Preparation of tumor tissue samples 

2.4.1 General remarks 

Macroscopically representative and non-necrotic tumor 
tissues were selected within 20 min after resection. 
Parallel samples were routinely prepared for cytology. 
The samples were processed as rapidly as possible on ice 
or at +4°C and in the presence of PIH. Cells were 
stained with DiffQuick (Baxter) and usually examined at 
three different occasions during the preparation proce- 
dure: (i) cytology sample, (ii) extracted cells and (iii) 
cells after Percoll gradient centrifugation. 

2.4.2 Specimen acquisition 

The strategy of sample preparation is shown in Fig. 1. 
Tumor tissue cell samples were usually obtained by fine 
needle aspiration (NA) using a 0.7 mm needle. The 
syringe was filled with 1—2 mL of ice-cold culture med- 
ium/PIH. We found that if a tumor appeared to be very 
fibrous it was difficult to extract enough cells for 2-DE 
analysis. In these cases, two alternative techniques were 
examined, (i) The tumor was cut in the middle and the 
fresh surface scraped (SC) by a scalpel. The cell-rich 
material was then transferred to ice-cold culture 
medium (L15 with 5% fetal calf serum)/PIH. (ii) A part 
of the tumor sample was placed in culture medium on 
ice for further processing at the laboratory in the fol- 
lowing way: the material was cut into very small frag- 
ments on a pre-cooled dissection plate and transferred 
to a small glass chamber with a 0.7 mm metal net 5 mm 
above the bottom of the chamber. Medium/PIH was 
added to cover the sample (8 mL) which was gently 
squeezed (SQ) towards the net in order to release and 
wash out cells. NA and SC were also compared with an 
enzymatic extraction (EE) procedure described previ- 
ously [5]. Briefly, thin slices of tissue were incubated 
with collagenase (1 mg/mL) and elastase (2 mg/mL) in 
medium for 1 h at 37°C. Extracted cells from every 
sample were then subjected to Percoll gradient centrifu- 
gation (Section 3.2.3). 



2.4.3 Separation of cells by Percoll gradient 
centrifugation 

The cell suspension was filtered through two nylon mesh 
filters, (i) 250 µm and (ii) 100 µm, and then centrifuged 



at 660 × g for 3 min. The cell pellet was resuspended 
carefully in medium, using a syringe, and loaded onto a 
two-step discontinuous Percoll/PBS gradient, 20.4% 
(density = 1.03 g/mL) and 54.7% (density = 1.07 g/mL), 
and centrifuged at 1000 × g for 15 min. In this system, 
dead cells stay on the top, viable cells sediment to the 
interphase and erythrocytes sediment to the bottom. The 
viability of cells in the top fraction and interphase was 
checked by the trypan blue exclusion test. The inter- 
phase cell layer (> 90% viability) was collected and 
washed one time in a large volume PBS/PIH (centri- 
fuged at 800 × g for 3 min). Finally, the cells were resus- 
pended in 1.4 mL PBS and pelleted at 2700 × g for 5 
min. The wet weight (WW) was recorded and the pellet 
was then stored at -80°C. 

2.4.4 Final preparation of cells for 2-D PAGE analysis 

From this point, cultured cell samples were treated 
in the same way as tumor cell samples: Each cell pellet 
was thawed on ice and resuspended in 1.89 µL mQ water 
per mg WW (= 1.89 × WW µL). The suspension was 
frozen and thawed 4-5 × to break the cells [7]. A 
volume of (0.089 × WW) µL 10% sodium dodecyl 
sulfate (SDS), including 33.3% mercaptoethanol, was 
mixed with the sample and incubated 5 min on ice with 
(0.329 × WW) µL of a solution of DNase I (0.144 
mg/mL in 20 mM Tris-HCl with 2 mM CaCl2 × 2H2O, pH 
8.8) and RNase A (0.0718 mg/mL Tris) [8, 9]. The sample 
was frozen and lyophilized. Sample buffer [10] including 
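
The wet-weight scaling of the water, SDS and nuclease additions described in this section can be written out as a small calculation. The sketch takes the three per-milligram factors from the text and applies them to a pellet weight; the function name, the dictionary keys and the 10 mg example pellet are hypothetical.

```python
# Volumes in microlitres per mg of wet weight (WW), as given in the text.
WATER_UL_PER_MG = 1.89
SDS_UL_PER_MG = 0.089
NUCLEASE_UL_PER_MG = 0.329


def lysis_volumes(wet_weight_mg):
    """Reagent volumes (in microlitres) for a cell pellet of the given wet weight."""
    if wet_weight_mg <= 0:
        raise ValueError("wet weight must be positive")
    return {
        "water_ul": WATER_UL_PER_MG * wet_weight_mg,
        "sds_mercaptoethanol_ul": SDS_UL_PER_MG * wet_weight_mg,
        "nuclease_ul": NUCLEASE_UL_PER_MG * wet_weight_mg,
    }


# Hypothetical 10 mg pellet.
for name, volume in lysis_volumes(10.0).items():
    print(name, round(volume, 2))
```

Scaling every addition to the wet weight keeps the reagent-to-cell ratio the same for pellets of different size.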
Figure 1. Experimental flow chart showing main steps of the prepara- 
tion procedures. The abbreviations used for nonenzymatic extraction 
procedures are: FZ, frozen sample preparation; NA, needle aspira- 
tion; SC, scraped; and SQ, squeezed sample. Extracted cells are then 
loaded as a suspension (top volume of each tube) onto either 
1.07 g/mL Percoll (left), or a discontinuous Percoll gradient from the 
nonenzymatic extraction (middle), or from enzymatic extraction 
(right). Cellular top- and interphase fractions are then used for 2-DE. 
For details see Section 2. 
Large Scale Biology Corporation is the leader in the integrated discovery, production 
and application of proteins - the functional units of all biological processes. 

Large Scale Biology Corporation (LSB, Vacaville, CA) and its subsidiary Large Scale 
to industry leaders in the fields of diagnostics [...] 

[...] goal of commercializing its proprietary GENEWARE viral 
vector system - a novel technology for gene expression. Using safe RNA viruses to transiently 

express genes in non-recombinant plants, LSB has positioned itself [...] 

[...] manufacturing and purification of diverse protein and peptide products. The 
[...] be applied to the expression of libraries of foreign genes in an 
[...] and associated proprietary technologies form the basis 

[...] development, biomanufacturing and a variety of proprietary products under 

[...] foundation, LSB understood the need to integrate functional genomic and protein 
[...] expertise with quantitative protein analysis and informatics to become a 
world-leader in the protein field. In 1999, LSB acquired a privately held pharmaceutical 
was founded in 1980. Dr. Holtz was a co-founder and Director of Research for [...] 
largest manufacturer of microencapsulated nutrients for agriculture and Director of 
[...] Research for Foremost-McKesson, Inc. Dr. Holtz received his Ph.D. in 
Biochemistry from Pennsylvania State University and served as Assistant Professor in the 
Department of Food Science and Nutrition at Ohio State University. 

Daniel Tuse, Ph.D., has been an officer of LSB since he joined the Company in 1995 as Vice
President, Pharmaceutical Development. Dr. Tuse manages the company's pharmaceutical
design and development programs, including LSB's novel vaccines and [illegible].
[illegible] Dr. Tuse was Assistant Director of SRI International's (Menlo
Park, Calif.) Life Sciences Division. In his 17 years at SRI, Dr. Tuse developed extensive R&D
[illegible]

[illegible] Rakitan, a co-founder of LSB, Senior Vice President & General Counsel and
Secretary, has served as an officer since 1988. Prior to joining LSB, Mr. Rakitan was an
attorney in private practice. Mr. Rakitan received his J.D. degree from [illegible]

[illegible] has [illegible] Controller since 1988 and was elected as
[illegible] 1988 and [illegible] supervisor for Varian Associates from June 1985
[illegible] also worked for Arthur Young and Co. (currently Ernst & Young).

Guy della-Cioppa, Ph.D., is an officer of the company and currently serves as Vice President,
Genomics. Prior to joining the company in 1989, Dr. della-Cioppa worked for Monsanto
Company in St. Louis, MO from 1984-1989 and was an NIH Postdoctoral Fellow at the
Worcester Foundation for Experimental Biology in Shrewsbury, MA from 1983-1984. He
received his Ph.D. in Biology from the University of California, Los Angeles.

William M. Pfann joined Large Scale Biology in August 2000 as Senior Vice President, Finance
[illegible] Officer. Mr. Pfann was formerly with PricewaterhouseCoopers
[illegible], most recently as the Risk Management Partner for the Western Region. He
served in a number of management roles at PwC, including leader of the firm's Silicon Valley
[illegible] networking and communications [illegible],
Partner of the [illegible] California emerging business group, as well as Partner-in-Charge of
the Oakland and Walnut Creek, California offices. Mr. Pfann received a B.S. degree from the
[illegible] Berkeley, in Business Administration and an MBA in
experience in using 2-DE in agronomic research and in designing analytical software for 1-D
and 2-D applications. He has held senior scientific positions in industry and research
institutes in the U.S., France and the Ivory Coast.

John Taylor, Ph.D., Vice President, Software Development and Bioinformatics. Dr. Taylor is
[illegible] analytical software for automated 2-DE [illegible]
analysis. Prior to joining LSB, Dr. Taylor served as computer scientist in the Molecular
Anatomy Program at Argonne, and on the research staffs of the University of Chicago and
the Armed Forces Institute of Pathology in Washington, D.C. Dr. Taylor received a B.S. in
[illegible] from the University of South Carolina, and a Ph.D. in Nuclear Physics from Duke
University.

Sandra Steiner, Ph.D., currently serves as Vice President, Proteomics Applications. Prior to
joining the Company, Dr. Steiner founded and directed the Molecular Toxicology Group at
Novartis in Basel, Switzerland and was a member in several multi-disciplinary drug
[illegible]. Dr. Steiner received her Ph.D. in Toxicology/Pharmacology from
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Leadership - Large Scale Proteomics Corporation 



[illegible], Chairman, President and CEO of Large Scale Proteomics
Corporation. Dr. Anderson obtained his B.A. in Physics with honors from Yale and a
Ph.D. in Molecular Biology from Cambridge University (England) working with M. F. Perutz as
[illegible] Fellow at the MRC Laboratory of Molecular Biology. Subsequently he co-founded
the Molecular Anatomy Program at the Argonne National Laboratory (Chicago) where his
work in the development of 2-dimensional electrophoresis (2-DE) and molecular database
technology earned him, among other distinctions, the American Association for Clinical
Chemistry's Young Investigator Award for 1982 and the 1983 Pittsburgh Analytical Chemistry
[illegible] co-founded LSP (originally Large Scale [illegible]

Norman G. Anderson, Ph.D., Chief Scientist at LSP. Dr. Anderson has a distinguished record
[illegible] senior positions at Oak Ridge and Argonne National
Laboratories (ORNL and ANL), more than 300 scientific publications and the receipt of more
[illegible] prestigious awards in recognition of his work in science and [illegible].
[illegible] invention of the zonal ultracentrifuge, he received the John Scott Medal Award and for the
[illegible] Biochemische Analytik für Klinische Chemie from Deutsche
Gesellschaft für Klinische Chemie for the most outstanding analytical development
in clinical chemistry worldwide during a 2-year period. In 1984 ANL awarded him its career
patent leader award for the largest number of patents issued to an employee. At [illegible]
[illegible] in terms of U.S. sales and [illegible]
were $250 million and $1 million, respectively. Dr. Anderson received his degrees at Duke
University: a B.A. in Zoology, M.A. in Physiology, and Ph.D. in Cell Physiology. He holds 18



[illegible] Vice President, Operations. Ms. Seniff has managed LSP's operations
[illegible]. [illegible] background includes thirteen years in international business prior to joining
LSP, [illegible] the employ of foreign firms. Ms. Seniff is responsible for helping
[illegible] business development and database commercialization strategies
for LSP [illegible] with management of LSP's parent company, Large Scale Biology
Corporation. [illegible] a degree in Business (with honors) from Florida State
University.

Robert J. Walden, Vice President, Finance at LSP. Mr. Walden joined LSP in 1997 and has
[illegible]. He previously served as Vice President of Finance and
[illegible] at Osiris Therapeutics, Inc., and as Chief Financial Officer at the American
Type Culture Collection (ATCC). Mr. Walden received his degree in Finance from the

[illegible] Hofmann, Ph.D., Vice President, Software Development at LSP. Dr. Hofmann is a
[illegible], having earned a B.S. in Biology, M.S. in Biochemistry and
Genetics, and Ph.D. in Plant Genetics from the University of Orsay, Paris. He has extensive



*0f 5 



05/04/2001 8:20 AM . 



Biosource Technologies 

h!.p://wuu.lsbc.conV.nio/info. 

[illegible] leader in identifying and [illegible]
new and more effective therapies, diagnostics, [illegible]

"Proteomics" is the study of the entire complement of proteins expressed in a cell, tissue or
organism. Proteomics can significantly improve drug [illegible].
[illegible] is associated with imbalances among, or malfunctions of, [illegible]. Only a small
fraction of diseases can be attributed to the presence of a defective gene. Unlike classical
genomics approaches that discover genes that may relate to a disease, LSP has developed a
proprietary system called the ProGEx module for directly characterizing [illegible].
Using the same technology, LSP can characterize [illegible]
[illegible] and to determine the degree to which this [illegible]

[illegible] many discoveries through an extensive portfolio of domestic
and foreign patents and have developed commercial alliances and partnerships to exploit the
value of their technologies. LSB and LSP scientists and engineers [illegible]
and application of resources to help clients meet their objectives as well as the development of
our own proprietary products for subsequent partnering with industry leaders.

A combined staff of 140 professionals operates from three locations in the United States with
a network of collaborators and affiliates throughout the US and Europe. Company
headquarters, R&D laboratories and its Genomics division are located in Vacaville, California,
[illegible] miles northeast of San Francisco. Process development and [illegible]
[illegible] LSB's large [illegible]

[illegible] of 5 million shares of [illegible]

Leadership - Large Scale Biology Corporation 

[illegible] Erwin, Chairman of the Board and Chief Executive Officer, founded LSB [illegible] and has
[illegible] 1987. Mr. Erwin is the former chairman of the State of
California Breast Cancer Research Council and currently serves on the University of California
[illegible] Engineering Advisory Council. He is Chairman of the Supervisory Board of [illegible].
A co-founder of Sungene Technologies Corp., Mr. Erwin served as Vice
President of Research and Product Development from 1981 through 1986. He has served on
the Biotechnology Industry Advisory Board for Iowa State University. Mr. Erwin received his
[illegible] degree [illegible] University and is an [illegible]
patent.



[illegible] McGee, Ph.D., a co-founder of LSB and Senior Vice President and Chief Operating
[illegible]. Prior to joining LSB, Dr. McGee was [illegible] of
[illegible] Technologies Corporation from 1983 to 1987. Dr. McGee received his
[illegible] from Louisiana State University and served as a faculty instructor of zoology
and genetics at Louisiana State University.

[illegible] Grill, Ph.D., a co-founder of LSB and Senior Vice President, Research and
Development, has served as an officer since 1987. Dr. Grill was the Manager of Plant
Molecular Biology for Sandoz Crop Protection Corp. from 1984 to [illegible]
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